Project Summary

Human  discrimination  of  complex  acoustic  signals  typically  cannot  be  predicted 
from  the  simple  sum  of  the  discriminabilities  associated  with  individual  components 
of  the  signal.  Understanding  such  failures  of  additivity  is  central  to  our 
understanding  of  complex  sound  perception.  The  goal  of  this  project  is  to 
elucidate  the  rules  and  mechanisms  whereby  individual  stimulus  components 
combine  to  influence  the  detection  and  discrimination  of  complex  sounds.  The 
project  is  designed  to  answer  specific  questions  regarding  listeners’  ability  to 
integrate  information  within  and  across  stimulus  dimensions,  to  extract  information 
contained  in  the  pattern  of  the  acoustic  signal,  and  to  perform  under  conditions  of 
stimulus  uncertainty.  The  data  are  also  used  to  determine  how  listeners  weight  the 
information  provided  by  different  components  of  the  signal,  and  how  best  to 
package  the  acoustic  information  in  frequency  and/or  time  so  that  it  is  processed 
most  effectively  by  the  listener.  Finally,  work  is  undertaken  to  develop  a 
computational  model  to  summarize  and  predict  the  results  of  these  and  future 
experiments.

Statement of Work/Research Objectives

Can  the  perception  of  a  complex  event  be  reduced  to  the  sum  of  its 
analyzable  elements?  This  was  one  of  the  fundamental  questions  that  occupied  the 
minds  of  the  earliest  thinkers  interested  in  understanding  human  perception. 
Today,  of  course,  we  are  familiar  with  the  Gestaltist’s  favorite  illusions 
demonstrating  that  the  perception  of  the  whole  is  often  greater  than  the  sum  of  its 
separate  parts.  By  demonstrating  the  importance  of  the  relations  among  parts, 
the  Gestalt  psychologist  redefined  the  study  of  perception  as  the  study  of  patterns. 

In  contemporary  psychoacoustics,  the  Gestaltist’s  influence  has  been  made 
evident  in  pattern  perception  models  of  pitch  (Goldstein,  1973;  Terhardt,  1974; 
Wightman,  1973),  localization  (Searle,  1982;  Perkins,  Kistler  and  Wightman,  1986), 
and  speech  (Stevens  and  Blumstein,  1978).  Now  there  is  evidence  that  simple 
auditory  detection,  as  well,  frequently  involves  an  analysis  of  the  overall  pattern  of 
excitation  produced  by  the  signal  and  masker  (Ahumada  and  Lovell,  1971; 
Ahumada, Marken, and Sandusky, 1975; Green, 1983; Green and Kidd, 1983; Green
and Mason, 1985; Hall, Haggard, and Fernandes, 1984; Hanna, 1984; Leek and
Watson, 1984; Lutfi, 1985, 1986; Spiegel, Picardi, and Green, 1981). The basic
result  of  the  detection  studies  is  a  failure  of  additivity;  components  of  the  acoustic 
complex  affect  threshold  in  ways  that  are  not  predicted  by  summing  their  separate 
effects.  Failures  of  additivity  impose  severe  constraints  on  our  ability  to  predict  the 
auditory  system’s  response  to  complex  stimuli,  like  speech,  from  the  response  to 
much  simpler  inputs.  Thus,  one  of  the  greatest  challenges  confronting 
psychoacoustics  in  the  years  ahead  is  to  understand  the  mechanisms  and 
invariances  that  determine  how  stimulus  components  combine  to  influence  auditory 
perception.

The present project adopts an approach to this problem which is both simple
and direct. In all experiments, the unit of analysis is the discriminability, as
measured by d’, of single tone bursts that differ (on average) in level. The
complex signals of these experiments comprise various combinations of 2 to
13 of these tone bursts distributed in frequency and/or time. On the basis of
simple additivity, the discriminability of the complex is given by the vector
summation rule, $d'_{complex} = (\sum_i (d'_i)^2)^{1/2}$, where $d'_i$ is the discriminability of the ith
tone component of the complex. The vector summation rule thus provides the
referent for evaluating the discriminability actually obtained. This simple approach
is used to address the following specific questions regarding the processing of
complex sounds:

(1)  How  efficiently  can  human  observers  integrate  information  within  and  across 
different  stimulus  dimensions? 

(2)  What  effects  do  varying  degrees  of  stimulus  uncertainty  along  relevant  and 
irrelevant dimensions have on the ability to integrate this information?

(3)  How  efficiently  can  observers  extract  information  contained  in  the  pattern  of 
level  variation  across  the  individual  components  of  the  complex? 

(4)  Which  components  of  the  complex  are  weighted  most  heavily  in  the  decision 
process? 

(5)  What  is  the  best  way  to  package  the  acoustic  information  in  frequency  and/or 
time  so  that  it  will  be  processed  most  effectively  by  the  observer? 

(6)  What  are  the  mechanisms  underlying  the  discrimination  of  these  complex 
sounds?  Can  a  computational  model  be  developed  to  account  for  the  results? 

Research  Progress 
Study  1:  Magnitude  Analysis 

This  early  experiment  was  designed  to  address  two  questions:  How  efficiently 
is  information  combined  across  frequency  channels,  and  what  effect  does  spectral 
uncertainty have on the ability to combine this information? The stimuli were
n-tone complexes, where n ranged from 1 to 13. The frequencies of the tones were
spaced at equal logarithmic intervals from 250 to 4000 Hz. Fig. 1A shows an example of
one of these complexes where n is 10. In this experiment, the tones were added
from low frequencies to high as n was increased (low-pass condition). All tone
complexes were played over 16-bit, audio-quality, D-to-A converters at a 20-kHz
rate. The complexes were gated on and off with 5-ms, cosine-squared ramps for a
total duration (from 0 voltage points) of 100 ms. On each interval of a
two-interval, forced-choice trial, the individual intensities of the tones in the complex
comprised a random sample of size n from one of two log-normal distributions:
LOW ($M_l$ = 65 dB, $\sigma_l$ = 5 dB) and HIGH ($M_h$ = 70 dB, $\sigma_h$ = 5 dB). The
value  of  n  was  fixed  for  each  block  of  trials.  The  listener’s  task  was  to  identify 
which  interval  contained  the  complex  drawn  from  the  HIGH  distribution. 
Feedback was given after each trial. The recording of trial-by-trial data, the
generation  of  stimuli,  and  all  other  experimental  events  were  controlled  by  an  IBM 
AT  computer. 

FIG 1. Examples of idealized stimulus spectra used in three different experiments:
(A) Magnitude analysis experiment, (B) Pattern analysis experiment, (C)
Information reliability experiment. See text for further details.

FIG 2. Integration functions from pilot experiment. Symbols represent the
averages of three subjects, over 1000 trials per subject; $\sigma$ = 5 dB (circles),
$\sigma$ = 10 dB (triangles). Solid line is performance of an ideal detector.
Dashed line is prediction of a model. See text for further details.

FIG 3. Psychometric functions from pilot experiment. Each panel gives data from
a different subject. Plotted along the abscissa is the difference between the overall
level of the two stimuli in each trial. Different symbols represent different n.

FIG 4. Weighting functions from pilot experiment. Horizontal dashed line is
percent agreement for n = 1. Other curves are percent agreement for n > 1.
Average of three subjects. See text for further details.

According to the Theory of Signal Detection, optimal performance for this
task as measured by d’ grows as the square root of n. Specifically,
$d'_{opt} = n^{1/2} \Delta / \sigma$, where $\Delta = M_h - M_l$ and $\sigma_h = \sigma_l = \sigma$.
Optimal performance is the
referent whereby we can determine how efficiently our listeners are able to make
use of the information provided by the different tones of the complex. Also, the
value of $\sigma$ provides an index of the degree of spectral uncertainty associated with
the task - the greater the value of $\sigma$, the greater the amount of uncertainty. To
measure the effects of uncertainty on listeners’ ability to combine information, we
simply vary $\sigma$ while being sure at the same time to adjust n or $\Delta$ so as to
maintain the same level of performance from an ideal detector.
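
The ideal-detector referent can be checked numerically. The following is a minimal sketch, not the software used in the experiment. It assumes that a log-normal intensity distribution corresponds to a normal distribution of level in dB, that the ideal rule is to choose the interval with the larger summed level, and that d’ is obtained from two-interval proportion correct as sqrt(2) times the z-score. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def dprime_from_pc(pc, trials):
    """Convert 2IFC proportion correct to d' = sqrt(2) * z(PC)."""
    # Keep PC away from 0 and 1 so the inverse normal stays finite.
    pc = min(max(pc, 0.5 / trials), 1.0 - 0.5 / trials)
    return math.sqrt(2.0) * NormalDist().inv_cdf(pc)


def ideal_dprime(n, delta=5.0, sigma=5.0):
    """Closed-form ideal d' from the text: sqrt(n) * delta / sigma."""
    return math.sqrt(n) * delta / sigma


def simulate_ideal(n, delta=5.0, sigma=5.0, m_low=65.0, trials=20000, seed=0):
    """Monte Carlo estimate of d' for an observer that sums per-tone levels (dB)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        high = sum(rng.gauss(m_low + delta, sigma) for _ in range(n))
        low = sum(rng.gauss(m_low, sigma) for _ in range(n))
        correct += high > low  # assumed ideal rule: pick the larger summed level
    return dprime_from_pc(correct / trials, trials)


if __name__ == "__main__":
    for n in (1, 4, 13):
        print(n, round(ideal_dprime(n), 2), round(simulate_ideal(n), 2))
```

Because the closed form depends only on the ratio of delta to sigma, doubling both (5 dB to 10 dB) leaves the ideal d’ unchanged, which is the manipulation described above for holding ideal performance constant while varying uncertainty.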

Fig.  2  shows  the  results  of  this  experiment.  The  solid  curve  represents 
optimal  performance  for  the  task  as  predicted  by  TSD.  The  circles  represent  the 
average  performance  of  three  listeners,  over  1000  trials  per  listener.  The  triangles 
represent the average performance of these same 3 listeners when both $\sigma$ and $\Delta$
were changed from 5 to 10 dB (constant $d'_{opt}$). The dashed line is the prediction
of a model which will be described later. For future reference, we will refer to
curves drawn in these coordinates as integration functions. First, note that the
listeners’ ability to integrate information across frequencies is less than optimal.
Whereas optimal performance grows as the square root of n, obtained performance
grows  more  nearly  as  the  cube  root  of  n  (cube  root  of  n  growth  is  the  dashed 
line).  What  is  intriguing  about  this  result  is  that  even  for  small  n  listeners  make 
so  little  use  of  the  information  available  in  the  stimulus.  Based  on  numerous 
studies  of  the  information  processing  capacity  of  humans,  we  had  expected  that  at 
least  7  give  or  take  2  components  would  have  been  processed  optimally  (Miller, 
1963).  The  suboptimal  performance  cannot  be  attributed  to  a  ceiling  effect  -  we 
have since replicated the cube root of n growth in performance at a lower level of
$d'_{opt}$. It is also unlikely that the tones were masking one another - even with a
half-octave separation between the tones performance is unchanged. Finally, the
results cannot be attributed to a simple lack of training. Most of our subjects are
practiced musicians, and some have been participating in these experiments for over a
year now with little or no observable improvement in performance.

Another intriguing result is that increasing the level of stimulus uncertainty ($\sigma$
of 5 versus 10 dB) has no effect. We had expected that, in general, higher levels
of uncertainty would produce poorer performance as Watson and many others have
found. In fact, a $\sigma$ of 5 and 10 dB represents a fairly wide range of stimulus
variability. The apparent discrepancy appears to be related to the fact that in
Watson’s experiments, unlike ours, there is no variation in the difference (level,
frequency, or duration) to be discriminated; put simply, optimal performance for
Watson’s task is unbounded. This may be important. Most naturally occurring
signals vary; thus, even an ideal detector would make errors on occasion. But this
is precisely the type of stimulus variability which in our experiment has no effect.
Could  it  be  that  the  effects  of  stimulus  uncertainty  are  largely  limited  to  those 
laboratory  conditions  in  which  the  difference  to  be  discriminated  is  fixed,  that  is, 
in  which  an  ideal  detector  makes  no  errors? 

In this regard at least, our task appears more analogous to a traditional noise
intensity discrimination task (e.g. Green, 1965) than to Watson’s uncertainty
experiments. Our $\sigma$ could be likened to the $\sigma$ of the sampling distribution of noise
energies; our $\Delta$ would be analogous to the mean difference between noise energies to
be discriminated. Of course, in our experiment, $\sigma$ is generally much larger than in
the noise discrimination experiment, and the forms of the distributions are different
as well. The important point to note, however, is that performance in both
experiments is found to be constant for a constant $\Delta/\sigma$ ratio, reflecting a type of
Weber fraction in both cases. Experiments are currently underway to partial out
the relative influence of peripheral and central factors on these results.

*At first, this may seem inconsistent with the results of earlier studies (e.g. Green, 1960) showing
square root of n growth for intensity discrimination of noise signals (where n refers to signal
bandwidth). One must remember, however, that in our experiment the distributions of individual
tone intensities are log-normal; thus overall intensity discrimination is a suboptimal strategy.

Preliminary  modelling  efforts 

We  have  now  pursued  several  computational  models  to  account  for  the  results 
of  this  pilot  experiment.  Although  these  models  have  so  far  only  been  applied  to 
the  data  of  this  experiment,  they  could  in  principle  be  applied  to  future  results 
obtained  in  any  of  the  experiments  of  this  proposal.  Each  of  these  models 
attributes  suboptimal  performance  to  a  different  stage  of  auditory  processing.  The 
outstanding  feature  of  these  models  is  that,  despite  their  differences,  they  all 
provide  an  equally  excellent  summary  of  the  preliminary  data,  in  each  case 
accounting  for  92%  or  more  of  the  total  variance.  We  believe  that  future  research 
should  be  largely  guided  by  attempts  to  empirically  test  these  models.  Indeed, 
such  tests  should  eventually  converge  on  a  subset  of  stimulus  conditions  for  which 
optimal  performance  is  both  predicted  and  obtained  —  this  would  provide  the 
litmus  test  of  any  model. 


Table I. Models of Information Integration

Model                         General Form                                              Specific Form           Growth Factor

Interchannel Correlation      $d'_n = \Delta/(\sigma^2/n + R)^{1/2}$                    $R \propto \sigma^2$    $[n/(1 + nR/\sigma^2)]^{1/2}$

Compressive Nonlinearity      $F(d'_n) = \sum_i F(d'_i)$                                $F(z) = z^p$            $n^{1/p}$

Limited Memory Capacity       $d'_n = [(d'_1)^2 + \sum_{i=2}^{n} (w_i d'_i)^2]^{1/2}$   $w_i = w$, a constant   $[1 + w^2(n-1)]^{1/2}$

Nonoptimal Decision Strategy  Difference in overall level                               -                       approx. $n^{1/3}$


Model  1:  Correlated  Observations.  In  our  pilot  experiment,  the  n  elements 
comprising  the  stimulus  sample  are  independent.  The  basic  assumption  of  the 
correlated  observations  model  is  that  the  n  observations  corresponding  to  these 
elements  are  not  independent.  In  effect,  the  model  assumes  that  there  is  a  source 
of  internal  (central)  noise  which  is  common  to  all  observations.  The  general 
formulation  of  this  model  is  given  in  Table  I.  Note  that  the  general  form  is 
identical  to  the  predictions  for  an  ideal  detector  with  the  exception  of  the  variance 
term  R  in  the  denominator.  The  variance  term  R  represents  the  influence  of  the 
central noise in this model. In the specific form of the model, R is assumed to
grow with the external variance $\sigma^2$ (i.e. the internal noise is multiplicative). The
value of R providing the best fit to the data is 3.4 dB ($R^{1/2}$ = 1.8 dB) which is in
reasonably  good  agreement  with  internal  noise  estimates  from  other  types  of 
intensity  discrimination  experiments  (e.g.  Bos  and  DeBoer,  1966;  Durlach,  1963). 

Model  2:  Information  Compression.  This  model  allows  that  all  observations 
are  independent.  However,  it  assumes  all  observations  are  subject  to  some 
nonlinear transformation both before and after they are combined. In Table I, the
nonlinear transformation is given by F which is assumed in this case to be a
power-law. When the exponent p of the power-law is 2 there is no information
loss with n and the model predicts optimal performance. For p > 2, there is a
progressive loss of information provided by each additional observation. The result
is a common form of information compression (Hafter and Dye, 1983; Lutfi, 1983;
Penner, 1978, 1980; Stevens, 1936). The value of p providing the best fit to the
data is 3.4. This yields a compressive exponent on n of 0.3, which again is in
good agreement with values obtained in the other types of studies cited above.

Model  3:  Limited  Memory  Capacity.  This  model  emphasizes  the  fact  that  on 
each  trial  of  the  two-interval,  forced-choice  task,  the  observer  must  compare  the 
observation  made  on  the  second  interval  with  a  memory  trace  of  the  observation 
made on the first interval. The trace is volatile and is assumed to deteriorate
over time. In our formulation of the model (Table I), performance is optimal
when the memory load is only one element (this assumes that the time between
observations is small, say less than a half second). For each additional element,
only a fraction $w_i$ of the information is preserved by the time the second
observation interval comes along for comparison. We find excellent fits to the data
when all $w_i$ are assumed equal to 0.5.
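
The growth factors of the first three models in Table I can be tabulated side by side using the best-fitting parameter values quoted in the text. This is an illustrative sketch only. The function names are hypothetical, and for Model 1 it assumes that R = 3.4 is a variance in squared dB and is paired with an external standard deviation of 5 dB; the text does not state which value of sigma the fitted R corresponds to.

```python
import math


def growth_correlated(n, R=3.4, sigma=5.0):
    """Model 1 growth factor: [n / (1 + n R / sigma^2)]^(1/2)."""
    return math.sqrt(n / (1.0 + n * R / sigma ** 2))


def growth_compressive(n, p=3.4):
    """Model 2 growth factor: n^(1/p); p = 2 recovers square root of n."""
    return n ** (1.0 / p)


def growth_memory(n, w=0.5):
    """Model 3 growth factor: [1 + w^2 (n - 1)]^(1/2)."""
    return math.sqrt(1.0 + w ** 2 * (n - 1))


if __name__ == "__main__":
    print("n  sqrt(n)  cube-root  model1  model2  model3")
    for n in (1, 2, 4, 8, 13):
        row = (math.sqrt(n), n ** (1.0 / 3.0), growth_correlated(n),
               growth_compressive(n), growth_memory(n))
        print(n, *(f"{x:.2f}" for x in row))
```

Over the range n = 1 to 13 the three functions stay close to one another, which is consistent with the statement that the models summarize the pilot data about equally well.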

Model 4: Nonoptimal Decision Strategy. This model assumes no special
degradation  or  compression  of  information  before  the  decision  stage.  Rather,  the 
model  assumes  that  performance  is  limited  by  the  listener’s  choice  of  a  nonoptimal 
rule  for  arriving  at  a  decision.  The  optimal  decision  strategy  in  our  pilot 
experiment  begins  by  computing  a  level  difference  between  the  first  and  second 
observation  interval  for  each  element  of  the  stimulus  sample.  The  optimal  strategy 
is  then  to  choose  interval  1  if  the  sum  of  these  differences  is  positive,  otherwise 
choose  interval  2.  We  have  explored  a  number  of  alternative  nonoptimal  decision 
rules  and  have  found  one  in  particular  that  provides  a  very  good  account  of  the 
data. In this nonoptimal strategy, decisions are based simply on the overall level
difference between the first and second observation interval (see footnote 1). This
decision rule approximately yields a cube root of n growth rate as shown by the
dashed line in Fig. 2.
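
The two decision rules can be compared by simulation. The sketch below is an illustration under stated assumptions, not the authors' model code: tone levels in dB are normally distributed (log-normal intensities), the overall level of a complex is taken as 10 log10 of the sum of the component intensities (power summation of tones at different frequencies), and d’ is computed from two-interval proportion correct. Function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def dprime_from_pc(pc, trials):
    """Convert 2IFC proportion correct to d' = sqrt(2) * z(PC)."""
    pc = min(max(pc, 0.5 / trials), 1.0 - 0.5 / trials)
    return math.sqrt(2.0) * NormalDist().inv_cdf(pc)


def overall_level_db(levels_db):
    """Overall level (dB): 10*log10 of summed intensities (assumed power sum)."""
    return 10.0 * math.log10(sum(10.0 ** (lv / 10.0) for lv in levels_db))


def simulate_rules(n, delta=5.0, sigma=5.0, m_low=65.0, trials=20000, seed=1):
    """Return (d' of summed-dB rule, d' of overall-level rule) on the same trials."""
    rng = random.Random(seed)
    hits_ideal = 0
    hits_overall = 0
    for _ in range(trials):
        high = [rng.gauss(m_low + delta, sigma) for _ in range(n)]
        low = [rng.gauss(m_low, sigma) for _ in range(n)]
        # Optimal rule from the text: sign of the summed per-element differences.
        hits_ideal += sum(high) > sum(low)
        # Nonoptimal rule (Model 4): compare overall levels only.
        hits_overall += overall_level_db(high) > overall_level_db(low)
    return (dprime_from_pc(hits_ideal / trials, trials),
            dprime_from_pc(hits_overall / trials, trials))


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 13):
        ideal, overall = simulate_rules(n)
        print(n, round(ideal, 2), round(overall, 2), round(n ** (1.0 / 3.0), 2))
```

For n = 1 the two rules make the same decisions, since overall level is a monotonic function of the single tone level. For larger n the overall-level rule is dominated by the most intense components and should fall below the ideal; the last printed column gives cube root of n for comparison.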

There  are  several  approaches  that  will  be  taken  to  test  among  these  models. 
Many  tests  will  simply  involve  the  manipulation  of  variables  explicitly  or  implicitly 
defined in the mathematical formulations of the models. These variables include
the variance of the distribution of members within each stimulus category (holding
$d'_{opt}$ constant), the number of members, the number of categories, the mathematical
form of the distributions, and the size of the sample randomly drawn on each
trial. Other tests will involve various manipulations in stimulus parameters and
various ways of "packaging" the information presented to observers. For instance,
the tones from signal and nonsignal distributions will be intermingled in frequency
and time in various ways to form different classes of spectral-temporal patterns
(Study 2). The final approach will be to evaluate the models based on trial-by-trial
analyses of the listeners’ responses. This latter approach is discussed in
greater  detail  below. 

Trial-by-trial  analyses 

Each of the models we have described makes a specific prediction regarding
the mathematical form of the integration function. Unfortunately, the differences
among these functions are so small that they cannot be resolved within the
measurement  error  of  our  experiment.  In  this  situation,  we  resort  to  analyzing  the 
models’  predictions  for  the  trial-by-trial  data. 

Consider  for  example  the  predictions  of  Model  4.  According  to  this  model, 
the listener responds to the interval perceived to have the higher overall level.
Thus, on trials in which the HIGH sample has the higher overall level the listener
will usually respond correctly; on trials in which the LOW sample has the higher
overall level, the listener should more often respond incorrectly. Indeed, if overall
level is the cue used by listeners, then the trial-by-trial data across all conditions
of the experiment should converge on a single psychometric function; the abscissa
for this function would be the difference between the overall level of HIGH and
LOW samples on each trial. Fig. 3 shows the results of this analysis. To obtain a
percent correct value at each level difference, the trial-by-trial data were
accumulated into 1 dB bins; thus the percent correct at, say, 5 dB is actually the
percent correct for all trials in which the level difference between the HIGH and
LOW samples was between 4.5 and 5.5 dB. Each panel represents the data from
a different subject; the different symbols correspond to the different sample sizes
(n). The solid curve in each panel is the best fitting logistic (see Bush, 1963).
The  data  do  indeed  tend  to  converge  on  a  single  psychometric  function.  Note  also 
that  for  a  performance  level  of  75%  correct,  the  overall  level  difference  for  all 
subjects  is  near  1  dB  -  a  normal  difference  limen  for  intensity  in  the  2IFC 
procedure. This analysis provides a necessary, not a sufficient, test of Model 4. It
demonstrates  nonetheless  how  the  trial-by-trial  data  may  be  used  to  gain  additional 
insights  into  the  processes  underlying  discrimination  performance. 
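
The binning step of this analysis can be sketched as follows. This is a minimal illustration, with hypothetical function and argument names, of accumulating trial-by-trial data into 1 dB bins centred on integer level differences; the treatment of values falling exactly on a bin edge (assigned to the higher bin here) is an assumption, and the logistic fit is not included.

```python
import math
from collections import defaultdict


def psychometric_bins(level_diffs_db, correct, bin_width=1.0):
    """Percent correct per level-difference bin, keyed by bin centre in dB.

    level_diffs_db: overall level of HIGH sample minus LOW sample, per trial.
    correct: truthy if the listener's response on that trial was correct.
    """
    counts = defaultdict(lambda: [0, 0])  # centre -> [n_correct, n_trials]
    for diff, ok in zip(level_diffs_db, correct):
        # A bin centred on 5 dB collects differences from 4.5 up to (not including) 5.5.
        centre = math.floor(diff / bin_width + 0.5) * bin_width
        counts[centre][0] += bool(ok)
        counts[centre][1] += 1
    return {c: 100.0 * k / m for c, (k, m) in sorted(counts.items())}


if __name__ == "__main__":
    diffs = [4.6, 5.4, 5.0, 0.2, -0.3, 5.6]
    right = [1, 1, 0, 1, 0, 1]
    print(psychometric_bins(diffs, right))
```

Pooling the binned values over all n and plotting them on a common abscissa gives the test of convergence on a single psychometric function described above.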

Another  use  of  the  trial-by-trial  data  is  to  provide  stimulus  weighting 
functions.  These  functions  are  intended  to  specify  the  relative  contribution  of 
different stimulus elements to the decision process (see question 4). The method
we have chosen is simply to count the agreements between the response on each
trial and the level difference of the ith element on each trial. For example, if on
a  given  trial  the  level  of  the  ith  element  is  higher  on  the  second  interval,  and  if 
the  response  is  to  the  second  interval,  then  the  response  is  scored  as  an  agreement. 
Fig.  4  shows  the  percent  agreements  for  each  of  the  elements  as  derived  from  the 
trial-by-trial  data  of  the  pilot  experiment.  Only  those  trials  in  which  the  ith  level 
difference  exceeded  5  dB  were  included  in  this  analysis.  This  restriction  was 
implemented to eliminate possible disagreements resulting from the listeners’ inability
to discriminate the level difference. Now suppose that when all thirteen elements
are played (n=13), the listener attends exclusively to the thirteenth element. The
percent agreements in this case should equal the percent agreements when only one
element was played (n=1). The percent agreements will be less than this to the
extent that the listener attends to the other n-1 components. The horizontal
dashed line gives the percent agreements for n=1. The results give little evidence
that listeners differentially weight the various frequencies that comprise these
signals. This is not too surprising since all elements constitute equally reliable
sources of information. We would not expect this to be true when different
reliabilities are associated with each element (study 5).
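
The agreement count behind the weighting functions can be sketched as below. This is an illustrative implementation with hypothetical names, not the original analysis program. It assumes that "exceeded 5 dB" refers to the absolute value of the level difference of the ith element, and that responses are coded 1 or 2 for the chosen interval.

```python
def percent_agreement(level_diffs_db, responses, min_abs_diff=5.0):
    """Percent agreement between response and each element's level difference.

    level_diffs_db[t][i]: level of element i on interval 2 minus interval 1, trial t.
    responses[t]: interval chosen on trial t (1 or 2).
    Returns one value per element, or None where no trial qualified.
    """
    n = len(level_diffs_db[0])
    agree = [0] * n
    total = [0] * n
    for diffs, resp in zip(level_diffs_db, responses):
        for i, d in enumerate(diffs):
            if abs(d) <= min_abs_diff:
                continue  # skip differences the listener might not discriminate
            favoured = 2 if d > 0 else 1  # interval in which element i was higher
            total[i] += 1
            agree[i] += (favoured == resp)
    return [100.0 * a / t if t else None for a, t in zip(agree, total)]


if __name__ == "__main__":
    diffs = [[6.0, -7.0], [8.0, 2.0], [-6.0, 9.0]]
    resp = [2, 2, 1]
    print(percent_agreement(diffs, resp))
```

In the small example the responses always follow the first element and never the second, so the first element reaches full agreement while the second does not.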

Study  2.  Pattern  analysis 

The relevant information distinguishing many naturally occurring signals is
contained  not  only  in  overall  intensity  differences  across  signals,  but  also  in  the 
pattern  of  intensity  variations  across  frequency  within  each  signal.  The  next  study 
focused  on  listeners’  ability  to  perform  spectral  pattern  analysis.  All  conditions 
were  identical  to  the  pilot  experiment  described  earlier  except  on  each  interval  of 
the  2IFC  trial,  half  the  tones  were  drawn  from  the  HIGH  distribution  and  the 
other  half  were  drawn  from  the  LOW  distribution.  On  one  interval,  the  odd 
numbered  tones  were  drawn  from  the  HIGH  distribution,  the  even  numbered  tones 
from the LOW (see Fig. 1B). On the other interval, the reverse was true. The
listener’s task was to identify the interval in which the odd numbered tones were
drawn from the HIGH distribution. To ensure that the subjects’ responses were
based on spectral pattern analysis, we also roved the overall level of the stimulus
on each presentation (see Green, 1988).

FIG 5. Discrimination efficiency as a function of number of components
in three experiments.

According to the Theory of Signal Detection, optimal performance for this
task is identical to that for the earlier magnitude discrimination task. The
performance of the ideal detector is unaffected by how the information is packaged
in frequency and/or time. It is of interest, therefore, to compare the performance
of our subjects in this task to their performance in the earlier magnitude
discrimination task. Both the trace memory model and the correlated observations
model predict that human performance in the pattern analysis task will be near
optimal. There is no decay of memory because the relevant comparisons are
between the intensities of components that occur simultaneously with one another.
There is no effect of common internal noise because the common noise is
subtracted out in the differencing of components. In contrast, the nonoptimal
decision model predicts that human pattern analysis will be worse than magnitude
analysis.

The  results  of  this  experiment  (squares)  are  compared  with  the  results  of  the 
magnitude  analysis  experiment  (circles)  in  Fig.  5.  The  data  are  plotted  as 
measures of discrimination efficiency, $\eta = (d'_{obt}/d'_{opt})^2$. It is clear that spectral
pattern analysis is significantly poorer for our subjects than simple intensity
discrimination. The difference increases with the number of components in the
stimulus. Though the results support the nonoptimal decision model, we cannot
rule  out  the  possibility  that  roving  overall  stimulus  level  may  have  had  a 
detrimental  effect  on  performance.  Experiments  are  underway  to  test  this 
hypothesis.  Additional  tests  of  these  models  will  involve  discrimination  of  patterns 
across  both  frequency  and  time. 
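
As a worked illustration of the efficiency measure, suppose (as an idealization of the magnitude analysis result reported earlier) that obtained performance grows as the cube root of n, optimal performance grows as the square root of n, and the two coincide at n = 1. The last assumption is made here only for the arithmetic and is not a reported result. Efficiency would then decline with n as follows:

```latex
\eta(n) = \left(\frac{d'_{obt}}{d'_{opt}}\right)^{2}
        \approx \left(\frac{n^{1/3}}{n^{1/2}}\right)^{2}
        = n^{-1/3},
\qquad
\eta(13) \approx 13^{-1/3} \approx 0.43
```

Under these assumptions a listener would be using somewhat less than half of the available information at n = 13.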

Study  3.  Differential  Weighting  of  Frequency  Components 

Not all frequencies comprising naturally occurring sounds can be expected to carry
the  same  amount  of  information  for  discrimination.  Obviously,  those  frequency 
components  conveying  the  greatest  amount  of  information  should  be  given  greatest 
weight  in  the  decision  process.  Study  3  investigated  listeners’  ability  to  select 
weights  appropriate  to  the  information  content  of  the  individual  frequency 
components  of  the  complex.  The  first  experiment  represented  an  extreme  case  in 
which  all  information  relevant  to  classification  was  contained  in  a  single  component, 
the  one  at  1  kHz.  On  one  interval  of  the  2IFC  trial,  the  level  of  the  1-kHz 
component  was  drawn  from  the  HIGH  distribution.  On  the  other  interval,  the 
level of this component was drawn from the LOW distribution. The levels of all
other tones on both intervals were drawn from the LOW distribution (see Fig. 1C).
The  listener’s  task  was  to  select  the  interval  in  which  the  1-kHz  component  was 
selected  from  the  HIGH  distribution.  We  were  quite  surprised  to  find  that  several 
of  our  best  subjects  could  not  perform  above  chance  on  this  task  (triangles  of  Fig. 
5),  even  after  considerable  practice.  We  had  expected  that  subjects’  performance 
would  be  near  optimal.  They  would  only  need  to  focus  their  attention  on  the 
critical  band  containing  the  single  1-kHz  component  and  ignore  all  other 
components.  Apparently  they  were  unable  to  do  this.  This  represents  a  rather 
severe  departure  from  the  critical  band  principle,  one  that  may  be  related  to  recent 
results  obtained  by  Neff  and  Green  (1987).  We  intend  to  pursue  this  result 
further by increasing the information (the d’) provided by the 1-kHz component,
and  by  slowly  increasing  the  information  conveyed  by  the  other  components. 
Weighting  functions  derived  from  the  trial-by-trial  data  should  indicate  whether  or 
not  listeners  selectively  weight  the  individual  frequency  components  according  to 
their information content.

In a previous article [Lutfi, J. Acoust. Soc. Am. 76, 1045-1050 (1984)], the following relation
was used to predict measures of frequency selectivity obtained in forward masking from
measures obtained in simultaneous masking: $F(g) = G + H(g) - H(0)$, where, for a given
masker level, F is the amount of forward masking (in dB) as a function of signal-masker
frequency separation (g), H is the amount of simultaneous masking, and G is the amount of
forward masking for g = 0. In the present study, the relation was tested for a wider range of
signal and masker frequencies, masker levels, and signal delays. The relation described
thresholds from all conditions well with the inclusion of one free parameter $\Delta$ corresponding to
a constant frequency increment, $F(g) = G + H(g + \Delta) - H(\Delta)$. The parameter $\Delta$ was
required to account for observed shifts in the frequency of maximum forward masking. It is
argued that a single tuning mechanism can account for commonly observed differences
between simultaneous- and forward-masked measures of frequency selectivity.

PACS  numbers:  43.66.Ba,  43.66.Dc,  43.66.Fe  [WAY] 


INTRODUCTION 

Forward masking refers to the elevation in the threshold of a signal presented
shortly after the masker has terminated. The residual effect of the masker
offers a means of measuring auditory frequency selectivity free from intrusive
interactions that may occur when the signal and the masker are played
simultaneously (e.g., Egan and Hake, 1950; Greenwood, 1971). Unfortunately,
differences exist among simultaneous- and forward-masked measures that, after
many studies, are still not well understood. By far, the largest differences,
and those that have received the most attention, are observed among
psychophysical tuning curves. This estimate of frequency selectivity gives the
level of the masker at each frequency necessary to mask a fixed-level,
fixed-frequency signal. Typically, tuning curves are observed to be narrower
when measured in forward masking than when measured in simultaneous masking
(Duifhuis, 1976; Houtgast, 1972, 1974; Lutfi, 1984; Moore et al., 1984; Moore,
1978; Weber, 1983; Wightman et al., 1977). There is little agreement regarding
the mechanisms underlying this result, although it has commonly been assumed
that at least two separate, frequency-selective processes are involved
(Duifhuis, 1976; Houtgast, 1972, 1974; Moore, 1978; O'Loughlin and Moore, 1981;
Terry and Moore, 1977; Weber, 1983; Wightman et al., 1977). Forward masking is
believed to be fundamentally different from simultaneous masking in that it
reflects the operation of these additional frequency-selective processes.

The decision to invoke additional tuning mechanisms came after physiological
studies had accumulated evidence of suppression from single-unit recordings in
the cat's auditory nerve (Kiang et al., 1965), and evidence for a
physiologically vulnerable "second filter" (Evans, 1975). Apparent similarities
suggested possible connections between these physiological observations and the
differences observed among psychophysical tuning curves. Weber (1983) has
reviewed three such theories in detail and has rejected one of them. Later
interpretations were to implicate "off-frequency listening" (O'Loughlin and
Moore, 1981) and "cuing" effects (Terry and Moore, 1977; Moore, 1978). However,
the frequency-dependent nature of these effects preserved the general
assumption that differences among tuning curves somehow reflect additional
frequency-selective processes operating in forward masking.

* Some of the data of this article were reported earlier in a NATO Advanced
Research Workshop [Weber and Lutfi, in Auditory Frequency Selectivity, edited
by B. C. J. Moore and R. D. Patterson (Plenum, New York, 1986)].

More recently, articles have begun to question the extent to which additional
tuning mechanisms are involved. In a contemporary review of the literature of
frequency selectivity, Jesteadt and Norton (1985) note that forward-masking
tuning curves may broaden markedly at high signal levels, while
simultaneous-masking tuning curves appear to change little (Stelmachowicz and
Jesteadt, 1984). They suggest that forward-masking tuning curves may be
narrower than simultaneous-masking tuning curves only for moderate- and
low-level signals; for high-level signals, forward-masking tuning curves might
actually be broader. Subsequent data of Moore and Glasberg (1986) indicate that
the difference between simultaneous- and forward-masking tuning curves is
reduced slightly at high signal levels. Nelson and Freyman (1984) report a
similar, perhaps related, broadening of tuning curves with increasing signal
delay (also see Kidd and Feth, 1981; Small and Busse, 1980). They show that, if
signal level is selected to equate the tips of the tuning curves, the tuning
curves do not change significantly with signal delay. Bacon and Moore (1986)
found that the difference between simultaneous- and forward-masking tuning
curves also depends on the temporal placement of the signal within the
simultaneous masker. When the signal occurs at either end of the simultaneous
masker (the trailing end being the typical placement in tuning curve
experiments), forward-masking tuning curves do appear significantly narrower.
However, when the signal occurs in the temporal center of the simultaneous
masker, this difference is much reduced. Even when tuning curves have shown
large differences in simultaneous and forward masking, the role of additional
tuning mechanisms has been questioned since other measures of frequency
selectivity fail to show such large differences (Lutfi, 1984; Weber and Lutfi,
1986). One such measure, the filter function, gives signal threshold as a
function of masker frequency for a fixed-level masker. Lutfi (1984) reports
filter functions that are essentially parallel in simultaneous and forward
masking over a wide range of masker levels.

The recent studies complicate the interpretation of forward-masking tuning
curves. They suggest that differences in measures, once thought to reflect the
operation of additional tuning mechanisms, may, in large part, be attributed to
interactions among the effects of masker frequency, masker level, and signal
delay that are peculiar to the tuning curve experiment. Presently, it is
difficult to determine the influence of such interactions. The previous studies
have generally focused on the effects of one or two of these factors while
holding the remaining factor(s) constant. Specific values chosen for the
remaining factors may, therefore, have played a role in producing the observed
differences among tuning curves. The purpose of the present study is to
investigate the interaction among all three factors rather than to provide a
fine-grain analysis of any one. Comparatively few masker frequencies were used
so that tuning curves and filter functions could be obtained for a wider range
of signal frequencies, masker levels, and signal delays than has been typical
of any one study. Based on the results, it is argued that differences commonly
observed between simultaneous- and forward-masking tuning curves are largely
epiphenomena of the tuning curve procedure.

I.  METHOD 

Filter functions and tuning curves were obtained in simultaneous and forward
masking by measuring threshold for a brief sinusoidal signal in the presence of
a variable-frequency, narrow-band noise masker. Filter functions were obtained
for three signal frequencies (f_s = 0.5, 1.0, and 2.0 kHz); tuning curves were
obtained for two signal frequencies (f_s = 0.5 and 2.0 kHz). The masker
frequencies (f) varied in proportion to the signal frequency:
(f - f_s)/f_s = -0.3, -0.2, -0.05, 0.0, 0.05, 0.1, and 0.2.
In simultaneous masking, the offset of the signal (0 voltage point) coincided
with the offset of the masker. In forward masking, the onset of the signal
followed the offset of the masker by 5, 10, 20, or 40 ms. Masker level varied
from 30 to 90 dB SPL depending on the particular combination of masker
frequency and signal delay. Complete filter functions were obtained for masker
levels of 50-80 dB SPL. These data were then used to derive tuning curves at
signal levels of 30-60 dB SPL. The details of this derivation are described in
Sec. II. Not all filter shapes and tuning curves were obtained for all possible
combinations of level and signal delay.
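
As an illustration of how the proportional masker spacing translates into
absolute frequencies, the following minimal Python sketch lists the masker
center frequencies for each signal frequency. The function and constant names
are hypothetical and serve only this example; the relative offsets are those
given in the text.

```python
# Relative masker-frequency offsets (f - f_s)/f_s listed in the Method section.
RELATIVE_OFFSETS = [-0.3, -0.2, -0.05, 0.0, 0.05, 0.1, 0.2]


def masker_frequencies(signal_hz, offsets=RELATIVE_OFFSETS):
    """Return masker center frequencies (Hz) for one signal frequency."""
    # f = f_s * (1 + g), rounded to 0.1 Hz to hide floating-point noise.
    return [round(signal_hz * (1.0 + g), 1) for g in offsets]


if __name__ == "__main__":
    for fs in (500.0, 1000.0, 2000.0):
        print(fs, masker_frequencies(fs))
```

For the 2.0-kHz signal, for example, the offset of -0.05 corresponds to a
masker centered at 1.9 kHz.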


In simultaneous masking, for relative masker frequencies
(f - f_s)/f_s = -0.3 and -0.2, a control measure was taken to prevent the
detection of aural combination bands generated by signal-masker interaction
(see Greenwood, 1971). A low-level band of noise (50 Hz wide and 30 dB below
the level of the primary masker) was gated on and off in the same manner as the
signal. The center frequency of the additional noise band was set equal to the
center frequency of the most audible aural combination band at 2f - f_s
(Greenwood, 1971). The amount of masking produced by this additional noise band
alone was always 25 dB or more below that produced by the primary masker, and
so it was not expected to produce any additional masking of the signal.
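
The placement of the control noise band can be sketched as follows. This is
only an illustrative calculation of the 2f - f_s combination-band frequency
and of a level 30 dB below the primary masker; the function names are
hypothetical.

```python
def combination_band_hz(signal_hz, g):
    """Center frequency (Hz) of the 2f - f_s band for relative offset g."""
    masker_hz = signal_hz * (1.0 + g)   # f = f_s * (1 + g)
    return 2.0 * masker_hz - signal_hz  # 2f - f_s


def control_noise_level_db(masker_level_db, offset_db=30.0):
    """Level of the control noise band, offset_db below the primary masker."""
    return masker_level_db - offset_db


if __name__ == "__main__":
    for g in (-0.3, -0.2):
        print(g, round(combination_band_hz(2000.0, g), 1))
    print(control_noise_level_db(80.0))
```

With a 2.0-kHz signal, the two offsets place the control band near 800 and
1200 Hz, respectively.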

A.  Stimuli 

The signal was a 10-ms sinusoid, shaped with 5-ms, Kaiser (W0 = 0.2) onset and
offset ramps (Childers and Durling, 1975). This ramp has the desirable property
that the spectral sidelobes are more than 70 dB down from the primary lobe
within 20% of the primary lobe center frequency. The narrow-band noise maskers
had 3-dB bandwidths of 50 Hz. They were gated on and off with 5-ms Kaiser ramps
for a total duration (between the 0 voltage points) of 200 ms. All stimuli were
digitally (PDP-11/40) synthesized and output through 14-bit DACs, low-pass
filtered at the 4-kHz cutoff frequency of Unigon (model LP-120, 120 dB/oct) and
Krohn-Hite (model 3343, 96 dB/oct) filters. The narrow-band noise maskers were
randomly sampled from a 3-s noise file. The levels of all stimuli were
controlled by programmable attenuators, and all stimuli were presented over
TDH-49 headphones (with 65001 cushions) to the right ear of subjects seated in
an IAC, double-wall, sound-attenuated chamber.

B.  Procedure 

In all conditions, signal threshold was the dependent variable. Signal
thresholds were obtained in daily 2-h sessions using a two-interval,
forced-choice, adaptive procedure (see Levitt, 1971). Threshold estimates were
based on the average of the last eight reversals in each adaptive run after the
first two reversals had been rejected. Five such estimates were obtained on
different days for each condition of the experiment. The lowest and highest of
the five estimates were rejected and the remaining three were averaged to
arrive at the final threshold estimate. The standard error of the trimmed mean
exceeded 3 dB for 5% of the cases (see Barnett and Lewis, 1978 for information
regarding the use of trimmed means). For all subjects, the pattern of results
was quite similar. Therefore, the data were further averaged across subjects.
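
The two averaging steps described above can be sketched in Python as follows.
This is a minimal illustration, assuming that each adaptive run is summarized
by a list of signal levels at its reversals; the function names and the
example values are hypothetical and are not data from the study.

```python
from statistics import mean


def run_threshold(reversal_levels_db, n_discard=2, n_average=8):
    """Average the last n_average reversal levels after dropping the first n_discard."""
    usable = reversal_levels_db[n_discard:]
    if len(usable) < n_average:
        raise ValueError("not enough reversals in this run")
    return mean(usable[-n_average:])


def trimmed_threshold(estimates_db):
    """Drop the lowest and highest of five estimates; average the remaining three."""
    if len(estimates_db) != 5:
        raise ValueError("expected five daily estimates")
    return mean(sorted(estimates_db)[1:-1])


if __name__ == "__main__":
    # Hypothetical reversal levels (dB SPL) from one adaptive run.
    reversals = [60, 50, 44, 40, 44, 40, 44, 40, 44, 40]
    print(run_threshold(reversals))
    # Hypothetical daily estimates (dB SPL) for one condition.
    print(trimmed_threshold([41.0, 43.5, 39.0, 47.0, 42.0]))
```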

Four normal-hearing individuals were paid observers in each phase of the study,
although the same four observers did not participate in each phase. One subject
was unable to continue after data had been collected for the 2.0-kHz signal.
Data for the 0.5-kHz signal were, therefore, collected with a replacement
subject. A second replacement was required before collecting data for the
1.0-kHz signal. The ages of the subjects ranged from 21 to 30 years. Each
subject was tested individually.

II.  RESULTS 

A.  Filter  functions  in  simultaneous  and  forward  masking 
as  a  function  of  signal  frequency 

Figure 1 shows simultaneous- and forward-masked thresholds as a function of the
relative masker frequency for each of the three signal frequencies. Masker
level is 80 dB SPL and signal delay in forward masking is 5 ms. Delaying the
signal results in an overall reduction in masked threshold, but the reduction
in threshold does not always appear to be the same at each masker frequency.
For instance, threshold for the 2.0-kHz signal is greatest in simultaneous
masking when the masker frequency is 2.0 kHz. In forward masking, however, the
1.9-kHz masker produces the highest threshold. The effect is also evident for
the 0.5- and 1.0-kHz signals. In each case, the maximum masking frequency (MMF)
in forward masking occurs at a frequency just below that obtained in
simultaneous masking. Similar shifts in the MMF in forward masking have been
reported previously by a number of investigators (Ehmer and Ehmer, 1969; Kidd
and Feth, 1981; Munson and Gardner, 1950; Nelson and Freyman, 1984; Vogten,
1978a,b; Widin and Viemeister, 1979a,b; Zwicker and Jaroszewski, 1982). These
shifts have been examined most extensively in the context of psychophysical
tuning curve experiments. Therefore, discussion of them is reserved for the
section on tuning curves.

The simultaneous-masked thresholds are described adequately, on the selected
coordinates, by filter functions with two linear segments. Expressing relative
masker frequency as g = (f - f_s)/f_s, the filter functions are of the form

$$
H(g) =
\begin{cases}
T_{\max} - \beta_l\,|g - a|, & g < a, \\
T_{\max} - \beta_u\,|g - a|, & g > a,
\end{cases}
\qquad (1)
$$

where β_u and β_l are, respectively, the unsigned slopes of the upper and lower
branches of the function, a corresponds to the frequency at the breakpoint, and
T_max is signal threshold (in dB) at the breakpoint. The parameter a allows the
MMF to be estimated slightly above or below the signal frequency. The curves
drawn through the simultaneous-masked thresholds were obtained by selecting
values of β_u, β_l, a, and T_max satisfying the least-squares criterion. The
results of the regression for the individual and mean data are shown in Table I
(80-dB masker level). Each curve represents the regression of four parameters
on only seven points, so the proportion of variance accounted for (r²) is
predictably high. In subsequent analysis, these simultaneous-masking filter
functions will be used to predict the forward-masked thresholds.
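
The two-segment function of Eq. (1) can be sketched in Python as below. The
sketch also computes a 3-dB bandwidth from the two slopes, under the assumption
(not stated explicitly in the text) that the bandwidth is the width of the
function 3 dB below its peak, multiplied by the signal frequency. The function
names are hypothetical; the parameter values are the mean-data entries of
Table I at the 80-dB masker level. Under this assumption the computed
bandwidths round to the 83, 83, and 252 Hz quoted in the text.

```python
def filter_function(g, a, beta_l, beta_u, t_max):
    """Eq. (1): threshold (dB) at relative masker frequency g."""
    slope = beta_l if g < a else beta_u  # lower branch below the breakpoint
    return t_max - slope * abs(g - a)


def bandwidth_3db_hz(signal_hz, beta_l, beta_u, drop_db=3.0):
    """Width (Hz) of the function drop_db below its peak (assumed definition)."""
    return signal_hz * (drop_db / beta_l + drop_db / beta_u)


# Mean-data parameters read from Table I (80-dB masker):
# (signal frequency in Hz, a, beta_u, beta_l, T_max)
MEAN_80_DB = [
    (500.0, 0.000, 35.7, 36.9, 72.3),
    (1000.0, -0.012, 107.0, 54.5, 76.4),
    (2000.0, 0.030, 143.0, 28.5, 73.7),
]

if __name__ == "__main__":
    for fs, a, beta_u, beta_l, t_max in MEAN_80_DB:
        print(fs, round(bandwidth_3db_hz(fs, beta_l, beta_u)))
```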

The degree of frequency selectivity exhibited by simultaneous-masking filter
functions is estimated by the steepness of the unsigned slopes, β_u and β_l.
For the 1.0- and 2.0-kHz signals, the low-frequency slope is small relative to
the high-frequency slope, reflecting the familiar upward spread of masking. The
average 3-dB bandwidths derived from the slopes are 83, 83, and 252 Hz,
respectively, for the 0.5-, 1.0-, and 2.0-kHz signals. These values are in
reasonable agreement with bandwidth estimates of the auditory filter obtained
in simultaneous masking with less intense maskers. Weber (1977), for instance,
reports 3-dB bandwidths of 97 and 217 Hz, respectively, for a 1.0- and 2.0-kHz
signal. Patterson (1976) obtained a 3-dB bandwidth of 69 Hz for a 0.5-kHz
signal.

[Figure 1 not reproduced: three panels, one per signal frequency; horizontal
axis is relative masker frequency, (f_m - f_s)/f_s.]

FIG. 1. Simultaneous- and forward-masked thresholds (circles and triangles,
respectively) as a function of relative masker frequency g for three signal
frequencies f_s = 500, 1000, and 2000 Hz. Signal delay is 5 ms. The data are
the averaged threshold of four subjects. The filter functions drawn through the
simultaneous-masked thresholds were obtained by least-squares regression
according to Eq. (1). The forward-masking filter functions were derived from
the simultaneous-masking filter functions according to Eq. (2).

TABLE I. Parameters for simultaneous-masking filter functions. The rms error
refers to the root-mean-square deviations of the data from the fitted curves.

Subject    f_s     Level       a     β_u     β_l     T_max      r²  rms error
          (Hz)  (dB SPL)            (dB)    (dB)  (dB SPL)               (dB)
JL         500        80   0.025    68.8    25.3      76.1   0.892        1.2
                      70   0.050    55.3    27.0      66.9   0.920        1.1
                      60   0.067    56.7    56.1      60.3   0.934        1.9
                      50   0.021    26.2    37.1      49.2   0.825        1.2
          1000        80  -0.033    72.9    64.2      74.5   0.732        3.7
          2000        80   0.021   131.0    29.4      73.6   0.995        0.5
                      70   0.037   101.0    44.8      62.2   0.914        1.7
                      60   0.074   146.0    59.3      56.3   0.979        1.4
                      50   0.050    73.8    42.3      46.1   0.946        0.9
DO         500        80   0.000    30.7    29.9      70.5   0.881        1.1
                      70   0.056    65.8    35.3      62.6   0.842        1.8
                      60   0.062    35.0    34.7      53.2   0.736        2.5
                      50   0.049     2.7    35.1      42.7   0.891        0.9
          1000        80   0.002    94.8    55.8      81.3   0.892        2.1
          2000        80   0.091   273.0    51.5      76.8   0.927        2.6
                      70   0.042   197.0    94.5      70.0   0.978        2.0
                      60   0.046   143.0   105.0      56.1   0.902        3.5
                      50   0.049   137.0   109.0      50.4   0.972        1.2
BL         500        80  -0.034    31.7    62.3      73.8   0.962        1.0
                      70   0.050    47.2    38.0      62.9   0.977        0.7
                      60   0.029     9.3    51.7      53.0   0.982        0.7
                      50   0.082    50.0    53.9      47.6   0.988        0.9
          1000        80   0.013   133.5    50.8      77.0   0.873        2.8
FB        2000        80   0.000   126.0    27.6      73.7   0.975        1.3
                      70   0.000   115.0    47.5      63.1   0.967        1.1
                      60   0.000    80.0    67.2      54.4   0.874        2.5
                      50   0.017    87.1    66.2      47.2   0.944        1.3
DD         500        80   0.000    22.8    46.3      69.7   0.829        2.2
                      70   0.000    53.4    50.8      60.8   0.951        0.9
                      60   0.030     4.4    27.4      49.0   0.922        1.1
                      50   0.000    11.4   164.0      42.5   0.997        0.7
BZ        1000        80  -0.013   142.9    39.2      76.6   0.979        1.4
DD        2000        80   0.049   145.0    15.4      71.2   0.958        1.5
                      70   0.034   144.0    46.4      66.1   0.959        1.5
                      60   0.021   105.0    69.0      54.2   0.988        0.9
                      50  -0.005    54.7    71.6      43.5   0.851        2.1
Mean       500        80   0.000    35.7    36.9      72.3   0.993        0.5
                      70   0.040    57.1    37.6      63.4   0.963        1.0
                      60   0.050    26.9    41.5      53.9   0.982        0.7
                      50   0.042    14.1    43.0      45.0   0.951        0.9
          1000        80  -0.012   107.0    54.5      76.4   0.921        2.1
          2000        80   0.030   143.0    28.5      73.7   0.990        1.0
                      70   0.030   138.0    58.4      65.3   0.991        1.1
                      60   0.035   111.0    72.0      55.0   0.961        1.5
                      50   0.032    83.1    74.8      46.9   0.954        1.4

We wish to determine whether or not the degree of frequency selectivity, as
indicated by the slopes of the filter functions, differs significantly in
forward masking. If the filter function in simultaneous masking is designated
H(g), then all filter functions having identical slopes are of the form

$$
F(g) = H(g + \lambda) + \varepsilon, \qquad (2)
$$

where λ and ε are constants. The parameter λ merely represents a shift in the
breakpoint frequency; it provides an estimate of the shift in the MMF in
forward masking. The parameter ε gives the corresponding change in T_max. The
simultaneous-masking filter functions were used to predict the forward-masked
thresholds by estimating the constants λ and ε in Eq. (2). The results are the
curves drawn through the forward-masked thresholds in Fig. 1. Table II (M = 80
dB, t = 5 ms) gives, for the individual and mean data, the values of λ and ε
satisfying the least-squares criterion. For the mean data, the proportion of
variance accounted for by the two-parameter fit is 0.959, 0.972, and 0.897,
respectively, for the 0.5-, 1.0-, and 2.0-kHz signals. Most of the residual
variance can be attributed to measurement error. A second regression, in which
all four parameters (β_u, β_l, a, and T_max) were allowed to vary, indicated no
significant or systematic departure from the values of the two-parameter fit.
In particular, the slope values were equally often greater than and less than
those of the two-parameter fit. The largest departures in mean slope values
occurred for the 2.0-kHz signal: a β_l of 47.9 (four-parameter fit) compared to
28.5 (two-parameter fit), and a β_u of 155 compared to 143. In this worst case,
the four-parameter fit accounted for an additional 4% of the total variance.

TABLE II. Parameters λ and ε (entries) for forward-masking filter functions.
Note that the r² value in the right-hand column refers to the proportion of
total variability accounted for in all conditions of the corresponding row.

Subject    f_s        50 dB SPL        60 dB SPL        70 dB SPL        80 dB SPL      r²  rms error
          (Hz)        λ       ε        λ       ε        λ       ε        λ       ε               (dB)
JL         500    0.021   -4.22    0.051   -6.54   -0.025   -8.70   -0.032   -11.2   0.959        1.5
          1000                                                       0.020   -14.2   0.885        2.4
          2000    0.050   -5.97    0.098    3.14    0.051   -11.8    0.020   -18.1   0.951        1.8
DO         500    0.000   -1.25    0.019   -4.49   -0.009   -6.11   -0.070   -9.32   0.912        2.4
          1000                                                      -0.081   -9.96   0.657        4.1
          2000    0.035   -4.71    0.066   -12.0    0.057   -16.7    0.065   -20.4   0.895        2.8
BL         500    0.012   -2.15    0.052    3.35    0.013   -7.66    0.055   -9.53   0.952        1.8
          1000                                                       0.046   -14.3   0.915        2.4
FB        2000    0.054   -7.61    0.054   -11.7    0.056   -14.4    0.045   -18.4   0.934        1.9
DD         500    0.024  -0.904    0.055   -2.61   -0.012   -3.08    0.126   -6.61   0.955        1.4
BZ        1000                                                     -0.0005   -15.1   0.888        2.4
DD        2000    0.050   -3.07    0.041   -6.18    0.045   -12.3    0.037   -11.3   0.943        2.1
Mean       500    0.031   -2.06    0.044   -4.22    0.004   -6.08    0.034   -9.76   0.978        1.3
          1000                                                       0.026   -14.7   0.972        1.2
          2000    0.055   -6.59    0.065   -8.58    0.055   -13.6    0.045   -17.2   0.966        1.6

B. Filter functions in simultaneous and forward masking
as a function of masker level

Each of the panels of Fig. 2 gives simultaneous- and forward-masked thresholds
for the 2.0-kHz signal obtained at a different masker level. The signal delay
in forward masking was 5 ms. As before, the filter functions in simultaneous
masking were estimated by selecting values of β_u, β_l, a, and T_max in Eq. (1)
satisfying the least-squares criterion. As Table I shows, the four-parameter
regression continues to account for a high proportion of the variance in the
simultaneous-masked thresholds at the lower masker levels. The upward spread of
masking with masker level is evident in the changing slopes of the filter
functions. At the lowest masker level, β_u and β_l are nearly equal; the filter
function is roughly symmetric. As masker level grows, β_u increases while β_l
decreases so that the filter function becomes highly asymmetric. Such changes
in masking asymmetry are evident for all four subjects and replicate those
commonly observed in simultaneous masking (e.g., Egan and Hake, 1950; Lutfi and
Patterson, 1984; Patterson and Nimmo-Smith, 1980; Vogten, 1978a).

[Figure 2 not reproduced: horizontal axis is relative masker frequency,
(f_m - f_s)/f_s.]

FIG. 2. Same as Fig. 1 except the simultaneous- and forward-masking filter
functions are plotted for the 2000-Hz signal, at four masker levels: 50-80 dB
SPL. Quiet threshold is 28 dB SPL and is designated by the knee in the filter
functions at this level.

J. Acoust. Soc. Am., Vol. 83, No. 1, January 1988

Robert A. Lutfi: Interpreting forward masking

The forward-masking filter functions were derived as before from the
simultaneous-masking filter functions using Eq. (2). The estimates of λ and ε
for the individual and mean data are given in Table II. The two-parameter fits
to the forward-masked thresholds are quite good. Excluding the 80-dB masker
condition, which was described earlier, the forward-masking filter functions
account for 97% or more of the total variability in the forward-masked
thresholds at each level. This means that the level-dependent changes in
masking asymmetry observed in simultaneous masking are maintained in forward
masking. Note again that the estimates of λ consistently place the MMF in
forward masking slightly below that in simultaneous masking, and slightly below
the frequency of the signal.
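
The way Eq. (2) shifts and offsets a simultaneous-masking filter function can
be sketched in Python as follows. This is an illustrative sketch only: the
function names are hypothetical, and the parameter values are the mean-data
entries for the 2000-Hz signal at the 80-dB masker level as read from Tables I
and II (a = 0.030, β_u = 143.0, β_l = 28.5, T_max = 73.7, λ = 0.045,
ε = -17.2).

```python
def simultaneous(g, a, beta_l, beta_u, t_max):
    """Eq. (1): two-segment simultaneous-masking filter function (dB)."""
    slope = beta_l if g < a else beta_u
    return t_max - slope * abs(g - a)


def forward(g, lam, eps, a, beta_l, beta_u, t_max):
    """Eq. (2): shift the simultaneous function by lam and offset it by eps."""
    return simultaneous(g + lam, a, beta_l, beta_u, t_max) + eps


def forward_mmf(a, lam):
    """Relative frequency of maximum forward masking implied by Eq. (2)."""
    return a - lam


# Mean-data values for the 2000-Hz signal, 80-dB masker (Tables I and II).
PARAMS = dict(a=0.030, beta_l=28.5, beta_u=143.0, t_max=73.7)
LAM, EPS = 0.045, -17.2

if __name__ == "__main__":
    g_peak = forward_mmf(PARAMS["a"], LAM)
    print("MMF (Hz):", round(2000.0 * (1.0 + g_peak), 1))
    for g in (-0.05, 0.0, 0.05):
        print(g, round(forward(g, LAM, EPS, **PARAMS), 2))
```

With these values the predicted peak falls at a relative frequency of about
-0.015, slightly below the signal frequency, and the predicted threshold at
g = -0.05 (the 1.9-kHz masker) exceeds that at g = 0.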

A similar pattern of results was obtained for the 0.5-kHz signal with the 5-ms
signal delay. Figure 3 shows the data while Tables I and II give the parameters
of the best-fitting filter functions. The forward-masking filter functions for
the 0.5-kHz signal account for a comparably high proportion of the variability
in both the individual and the mean forward-masked thresholds. For the mean
data, the proportion of variance accounted for is 0.978. Changes in masking
asymmetry with level are less pronounced for the 0.5-kHz signal than for the
2.0-kHz signal. At the lowest masker level, the filter function is asymmetric
with β_u less than β_l. This asymmetry is also opposite to that of the 2.0-kHz
filter functions. Such reversals in masking asymmetry are not uncommon,
particularly at low masker levels (e.g., Lutfi and Patterson, 1984; Zwicker and
Jaroszewski, 1982). As before, β_u increases with masker level while β_l tends
to decrease. Consequently, the 0.5-kHz filter function becomes nearly symmetric
at the highest masker level.

An important feature of both the 0.5- and 2.0-kHz data is the interaction that
is observed between the effects of masker level and signal delay. At any given
masker frequency, the threshold reduction that results from delaying the signal
is greater at high masker levels than at low ones. For instance, when masker
level is 50 dB, the difference between simultaneous- and forward-masked
thresholds for on-frequency maskers (masker frequency equal to the signal
frequency) is about 5 dB. When masker level is 80 dB, however, the dB
difference is nearly quadrupled. The interaction between masker level and
signal delay for on-frequency maskers has been described in detail by Jesteadt
et al. (1982). The data of Figs. 2 and 3 indicate that the level-delay
interaction behaves similarly for off-frequency maskers.


C. Tuning curves in simultaneous and forward masking
as a function of signal level

Figures 4 and 5 give tuning curves derived from the data of Figs. 2 and 3,
respectively. The method for deriving these tuning curves follows that of Lutfi
(1984) and Bacon and Viemeister (1985). Simultaneous- and forward-masking
functions were estimated for each masker frequency by linear, least-squares
regression of the mean thresholds on masker level. The masking functions were
then used to compute the masker level at each frequency corresponding to a
fixed threshold of 30, 40, 50, or 60 dB. Such point estimates based on the
regression provide greater reliability than those based on a single mean
provided that the relation between the variables is truly linear (Cohen and
Cohen, 1975). Table III gives the obtained slope and intercept values. The
estimated masking functions describe the data quite well. The worst case is
represented by the function with the smallest slope (0.16); here the proportion
of variance accounted for was 91%. Also shown in Figs. 4 and 5 are predicted
tuning curves (dotted lines). These curves were obtained by taking horizontal
cuts through the predicted filter functions of Figs. 2 and 3. In some cases, it
was also necessary to interpolate between points on the filter functions in
order to determine the tip of the tuning curve.

[Figure 3 not reproduced: horizontal axis is relative masker frequency,
(f_m - f_s)/f_s.]

FIG. 3. Same as Fig. 2 except simultaneous- and forward-masking filter
functions are shown for the 500-Hz signal.

[Figure 4 not reproduced: one axis is masker level, dB SPL.]

FIG. 4. Simultaneous- (circles) and forward- (triangles) masking tuning curves
at signal levels of 30-50 dB SPL. Signal frequency is 2000 Hz and signal delay
is 5 ms. The obtained tuning curves (continuous lines) were derived from the
masked thresholds of Fig. 2. The dashed lines are predictions derived from
Eq. (2). See text for details.

The tuning curves obtained for the 2.0-kHz signal (Fig. 4) are representative
of those reported in previous studies (e.g., Moore, 1978; Small, 1959; Vogten,
1978b; Weber, 1983). They have the familiar V shape in which the right branch
of the V is quite steep and the left branch is slightly bowed. Note also that
the tips of the tuning curves in forward masking are displaced slightly to the
left of those in simultaneous masking. This disparity reflects the shifts in
the MMF noted earlier in the filter functions. Disparities between the tips of
simultaneous- and forward-masking tuning curves have been reported previously
by Vogten (1978a,b), Kidd and Feth (1981), Nelson and Freyman (1984), and Widin
and Viemeister (1979a,b). In the study by Vogten, the MMF occurs at the signal
frequency in forward masking, but slightly above the signal frequency in
simultaneous masking. Vogten's tuning curves were obtained at low stimulus
levels. The MMF shifts observed in the first and second panels of Fig. 4
replicate those reported at low levels. All of the remaining authors show that,
as stimulus intensity is increased, the MMF in forward masking shifts to a
frequency below that of the signal. This pattern of results is evident in the
third panel of Fig. 4.
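
The derivation of a tuning-curve point from a masking function, as described
earlier in this section, can be sketched in Python as follows: threshold is
regressed linearly on masker level, and the fitted line is then inverted to
find the masker level that yields a fixed threshold. The function names are
hypothetical, and the example thresholds are made-up values lying exactly on a
line with slope 0.89 and intercept 1.5 dB; they are not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def masker_level_for_threshold(target_db, slope, intercept):
    """Invert threshold = slope * level + intercept for the masker level."""
    return (target_db - intercept) / slope


if __name__ == "__main__":
    # Hypothetical masking function at one masker frequency.
    masker_levels = [50.0, 60.0, 70.0, 80.0]   # dB SPL
    thresholds = [46.0, 54.9, 63.8, 72.7]      # dB SPL
    slope, intercept = fit_line(masker_levels, thresholds)
    for target in (30.0, 40.0, 50.0, 60.0):
        level = masker_level_for_threshold(target, slope, intercept)
        print(target, round(level, 1))
```

Repeating the inversion at each masker frequency for one fixed threshold gives
the points of one tuning curve.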


[Figure 5 not reproduced: horizontal axis is relative masker frequency,
(f_m - f_s)/f_s.]

FIG. 5. Same as Fig. 4 except signal frequency is 500 Hz; the tuning curves
were derived from the masked thresholds of Fig. 3.




J. Acoust. Soc. Am., Vol. 83, No. 1, January 1988


Robert  A.  Lutfi:  Interpreting  forward  masking 




TABLE III. Slope and intercept values of the masking functions obtained from
linear least-squares regression with the thresholds averaged across subjects.

fs (Hz)  Delay (ms)    g = (fm - fs)/fs  Slope        Intercept (dB SPL)  n  r
500      simultaneous  -0.30             1.07         -24.6               3  [illegible]
                       -0.20             1.04         [illegible]         5  [illegible]
                       -0.05             1.03         [illegible]         5  [illegible]
                        0.00             0.85           3.2               6  0.984
                        0.05             0.84           4.0               5  0.907
                        0.10             [illegible]    1.7               5  [illegible]
                        0.20             0.74           5.4               4  0.967
500      5             -0.30             0.73          -0.7               3  [illegible]
                       -0.20             [illegible]   -4.5               4  [illegible]
                       -0.05             0.77           0.6               4  0.994
                        0.00             0.60          13.1               6  0.994
                        0.05             0.62         [illegible]         4  0.995
                        0.10             0.54          15.7               4  0.992
                        0.20             0.59           9.5               4  0.997
2000     simultaneous  -0.30             1.63         [illegible]         3  0.992
                       -0.20             1.28         -37.2               5  0.989
                       -0.05             1.05         -13.5               5  0.990
                        0.00             0.89           1.5               6  0.996
                        0.05             0.85           4.1               5  0.992
                        0.10             0.92          -8.6               5  0.994
                        0.20             0.51           7.7               4  0.976
2000     5             -0.30             0.88         -24.3               3  0.975
                       -0.20             0.69          -6.7               4  0.989
                       -0.05             0.67           4.9               4  0.988
                        0.00             0.49          13.9               6  0.993
                        0.05             0.46           9.6               4  0.975
                        0.10             0.28          16.3               4  0.962
                        0.20             0.16          20.3               4  0.914
2000     10            -0.20             0.80         -14.3               4  0.998
                       -0.05             0.47          10.2               4  1.000
                        0.00             0.42          17.0               6  0.996
                        0.05             0.36          11.7               4  0.999
                        0.10             0.40           7.9               4  0.953
2000     20            -0.20             0.71         -12.0               4  0.969
                       -0.05             0.46          12.1               4  0.989
                        0.00             0.37          16.2               6  0.993
                        0.05             0.44           6.5               4  0.979
                        0.10             0.35           7.8               4  0.960
2000     40            -0.05             0.43           8.6               4  0.986
                        0.00             0.19          22.8               6  0.927
                        0.05             0.31          11.5               4  0.980


The tuning curves of Fig. 4 are further typical in that overall they appear
narrower in forward masking. For instance, for the 40-dB signal, the slope of
the high-frequency branch of the tuning curve is roughly 190 dB/oct in
simultaneous masking, while in forward masking it is near 320 dB/oct. The
respective slopes for the low-frequency branch of the tuning curve in
simultaneous and forward masking are 40 and 45 dB/oct. These values are
within the range of values that have been obtained in previous studies. The
disparity of tuning in simultaneous and forward masking also appears to
persist at high signal levels, consistent with the data of Moore and
Glasberg (1986). Only the slope of the low-frequency tail of the
forward-masking tuning curve (from g = -0.3 to -0.2) appears to become
shallower at these high signal levels.

The disparity between the tuning curves in simultaneous and forward masking
is related to the masker level-signal delay interaction described earlier.
Recall that, at any given masker frequency, the threshold reduction that
results from delaying the signal is greater at high masker levels than at
low. This interaction affects the tuning curve because, for the tuning
curve, masker level covaries with masker frequency. For the low-level,
on-frequency maskers, the threshold reduction produced by delaying the
signal is small. Thus the increment in masker level necessary to compensate
for the threshold reduction is small. For the high-level, off-frequency
maskers, the threshold reduction produced by delaying the signal is large;
thus the corresponding increment in masker level is large. The fact that the
tuning curves at 2.0 kHz are narrower in forward masking may, therefore, be
understood in terms of a three-way interaction among the effects of masker
frequency, masker level, and signal delay.
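The covariation of masker level with masker frequency can be illustrated
with the regression values of Table III for the 2.0-kHz signal. The sketch
below is only an illustration, not the procedure used to draw Figs. 4 and 5
(those curves come from the filter functions). It assumes that each masking
function has the linear form threshold = slope x masker level + intercept,
and it extrapolates those lines beyond the measured range; the function
names are hypothetical.

```python
# Masking-function regressions for the 2.0-kHz signal, read from Table III.
# Each entry maps relative masker frequency g to (slope, intercept in dB SPL).
# Assumption: signal threshold = slope * masker_level + intercept.
SIMULTANEOUS = {
    -0.20: (1.28, -37.2),
    -0.05: (1.05, -13.5),
    0.00: (0.89, 1.5),
    0.05: (0.85, 4.1),
    0.10: (0.92, -8.6),
    0.20: (0.51, 7.7),
}
FORWARD_5MS = {
    -0.20: (0.69, -6.7),
    -0.05: (0.67, 4.9),
    0.00: (0.49, 13.9),
    0.05: (0.46, 9.6),
    0.10: (0.28, 16.3),
    0.20: (0.16, 20.3),
}


def tuning_curve(masking_functions, signal_level_db):
    """Masker level (dB SPL) needed to just mask a fixed-level signal at each g."""
    return {
        g: (signal_level_db - intercept) / slope
        for g, (slope, intercept) in masking_functions.items()
    }


def tip(curve):
    """Relative masker frequency at which the required masker level is lowest."""
    return min(curve, key=curve.get)


sim = tuning_curve(SIMULTANEOUS, 40.0)
fwd = tuning_curve(FORWARD_5MS, 40.0)
for g in sorted(sim):
    print(f"g = {g:+.2f}: simultaneous {sim[g]:6.1f} dB, forward {fwd[g]:6.1f} dB")
print("tips at g =", tip(sim), "(simultaneous) and", tip(fwd), "(forward)")
```

Under these assumptions the shallow forward-masking slopes above the signal
frequency translate into very high required masker levels, so the
high-frequency branch of the forward-masking curve rises much more steeply,
and its tip falls at a lower masker frequency than in simultaneous masking.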

The tuning curves obtained for the 0.5-kHz signal are shown in Fig. 5.
Unlike the tuning curves for the 2.0-kHz signal, these curves fail to
evidence any significant difference in terms of the degree of apparent
tuning in simultaneous and forward masking. Unfortunately, there are few
comparable data in the literature at this low signal frequency. Vogten
(1978b) reports tuning curves at 0.5 and 2.0 kHz for one subject. This
subject's data agree with the present data inasmuch as the difference in
tuning between simultaneous and forward masking was much less apparent for
the 0.5-kHz signal. Moore et al. (1986) report comparable data from two
subjects. One of these subjects showed narrower tuning curves in forward
masking at 0.5 kHz, while the other failed to show any difference in tuning,
at least within the range of frequencies used in the present study.

FIG. 6. Forward-masked thresholds for signal delays of 5 (circles), 10
(triangles), 20 (squares), and 40 (crosses) ms. Signal frequency is 2000 Hz.
The forward-masking filter functions were derived from the
simultaneous-masking filter functions according to Eq. (2). Note that the
filter functions and corresponding thresholds for the 10-, 20-, and 40-ms
delays have been shifted downward (number of dB indicated to the right of
each function) to improve visible separation. Axes: masked threshold (dB
SPL) versus relative masker frequency, (fm - fs)/fs.


D. Filter functions and tuning curves as a function of signal delay

To further test the generality of Eq. (2), we have obtained forward-masked
thresholds for the 2.0-kHz signal as a function of signal delay. These
forward-masked thresholds are given in Fig. 6. The filter functions drawn
through these data were derived from the simultaneous-masking filter
functions and Eq. (2) as before. Table IV gives the corresponding values of
λ and ε, and the proportion of variance accounted for. Note that the
threshold and filter functions are shifted downward with each delay to
improve visible separation. The data described previously for the 5-ms delay
are also included for comparison. Individually, the proportion of variance
accounted for by the two-parameter fits tends to be lower at longer signal
delays. For subject DO, in particular, the predicted functions account for
less than 70% of the total variability in some cases. There were two factors
that contributed to the lower proportion of variance accounted for
individually. First, the range over which masking could be measured at the
long delays was restricted. Consequently, there were fewer thresholds and
the total variability among thresholds was smaller. Second, at the long
signal delays, the individual off-frequency thresholds tended to be more
variable so that there was greater measurement error. This was most true of
subject DO.

TABLE IV. Parameters λ and ε (entries) for forward-masking filter functions.
Note that the r² value in the right-hand column refers to the proportion of
total variability accounted for in all conditions of the corresponding row.
[The subject and masker-level row labels and the entries for the shorter
signal delays are not recoverable from the scan; the surviving entries are
listed in their original order, in groups of three.]

Signal delay 40 ms
λ        ε (dB)    r²       rms error (dB)
0.040    -29.2     0.931    1.9
0.059    -22.4     0.915    2.1
0.060    -20.2     0.927    1.5

0.074    -35.2     0.671    3.8
0.025    -30.1     0.641    3.0
0.106    -20.2     0.840    2.0

0.050    -29.5     0.805    3.2
0.090    -19.7     0.914    1.7
0.072    -20.0     0.823    2.1

0.006    -28.2     0.864    2.4
0.020    -23.9     0.851    1.8
0.040    -14.7     0.864    1.6

0.050    -29.8     0.871    2.2
0.040    -25.1     0.879    1.7
0.067    -18.8     0.901    1.6

Turning to the mean data, the two-parameter fits account, respectively, for
90%, 88%, and 87% of the variability in the mean forward-masked thresholds
at masker levels of 60-80 dB. In each case, the rms error between the
predicted and obtained thresholds is close to 2 dB. The corresponding
four-parameter fits accounted, respectively, for 96%, 95%, and 94% of the
variability. The slopes of the filter functions resulting from the
four-parameter fits at the longer signal delays cannot be considered very
reliable given the limited number of data points defining each curve.
However, in one case, these slopes did consistently and significantly
deviate from those predicted by the two-parameter fit. For the 10-ms signal
delay, the slopes of the four-parameter fits on the low-frequency side were
generally smaller, ranging from 36.0 dB for the 60-dB level masker to 0.2 dB
for the 80-dB level masker. The corresponding slopes for the two-parameter
fits (see Table I) range from 72.0 to 28.5 dB.
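The two summary statistics quoted here, proportion of variance accounted for
and rms error, can be computed as in the sketch below. The threshold values
are hypothetical and serve only to illustrate the point made above about
restricted range: with the same residuals, a smaller spread among the
obtained thresholds yields a lower proportion of variance accounted for. The
definitions used (no degrees-of-freedom correction) are an assumption; the
exact computation used in the study is not stated in this section.

```python
import math


def fit_summary(obtained, predicted):
    """Return (proportion of variance accounted for, rms error in dB)."""
    n = len(obtained)
    mean = sum(obtained) / n
    ss_total = sum((y - mean) ** 2 for y in obtained)
    ss_resid = sum((y - p) ** 2 for y, p in zip(obtained, predicted))
    return 1.0 - ss_resid / ss_total, math.sqrt(ss_resid / n)


# Hypothetical thresholds in dB SPL (illustration only, not data from the study).
obtained_wide = [20.0, 30.0, 40.0, 30.0, 20.0]
predicted_wide = [22.0, 29.0, 38.0, 31.0, 21.0]

# Same residuals (+2, -1, -2, +1, +1 dB), but a narrower range of thresholds.
obtained_narrow = [25.0, 30.0, 35.0, 30.0, 25.0]
predicted_narrow = [27.0, 29.0, 33.0, 31.0, 26.0]

r2_wide, rms_wide = fit_summary(obtained_wide, predicted_wide)
r2_narrow, rms_narrow = fit_summary(obtained_narrow, predicted_narrow)
print(f"wide range:   r2 = {r2_wide:.3f}, rms error = {rms_wide:.2f} dB")
print(f"narrow range: r2 = {r2_narrow:.3f}, rms error = {rms_narrow:.2f} dB")
```

The rms error is identical in the two cases, while the proportion of
variance accounted for drops when the range of thresholds is restricted.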

The first panel of Fig. 7 gives examples of tuning curves corresponding to
these data. Predictions are shown as dotted lines as before. The 35-dB
signal in this case was selected to be representative of the low-level
signals for which tuning curves are most frequently reported in the
literature. The major effect of increasing signal delay is an overall
elevation of the tuning curve accompanied by a shift in the tip of the curve
(the point at the MMF) to a frequency below that of the signal. Kidd and
Feth (1981) and Nelson and Freyman (1984) report identical shifts in the
tips of tuning curves with increasing delay. In their data, the shift in
tips is often accompanied by a decrease in the slope of the low-frequency
branch of the tuning curve, as is evident in the predicted curves of Fig. 7.
Nelson and Freyman (1984) report that the differences among their tuning
curves are largely eliminated if signal level-signal delay combinations are
selected such that the level of maskers near the tips of the curves are
equated. When this is done, their tuning curves nearly superimpose. The
second panel of Fig. 7 shows tuning curves in which the level of maskers
near the tips of the curves has been equated. For the signal delays of 5 and
40 ms, the corresponding signal levels are 40.8 and 32.3 dB SPL. In keeping
with the data of Nelson and Freyman (1984), the differences among these
curves are largely reduced.

FIG. 7. Left panel: forward-masking tuning curves for a 35-dB SPL signal at
signal delays of 5 (circles) and 40 (triangles) ms. The tuning curves were
derived from the data of Fig. 6 (see text for details). Dashed lines
represent predictions based on Eq. (2). Right panel: Signal level is
selected for each signal delay so as to equate masker level near the tips of
the tuning curves. The lower of the two dashed curves gives the predictions
for the 5-ms delay; the solid curve is omitted for clarity of presentation.
Abscissa: relative masker frequency, (fm - fs)/fs.


E. Analyses with fewer parameters

Earlier data of Lutfi (1984) were described well by assuming a λ of 0. The
value of ε was then given as the dB difference between thresholds in
on-frequency (g = 0) forward and on-frequency simultaneous masking.
Specifically, ε = G - H(0), where G is the on-frequency, forward-masked
threshold. Substituting into Eq. (2), the earlier model's predictions were
given by the relation

$$F(g) = H(g) + [G - H(0)]. \qquad (3)$$

Note that this model contains no free parameters. In terms of Eq. (2), λ is
a predetermined constant and ε is extracted directly from the data. The
fixed-parameter model predicts that for a fixed-level masker, the time decay
of masking in dB is the same for all masker frequencies and is given by the
difference G - H(0).

In this section, the earlier model and two modifications are applied to the
present data. In the first version, the time decay of masking is assumed to
be constant as before, but no specific relation is assumed between masker
level and the amount of decay. This model's predictions are given by

$$F(g) = H(g) + \varepsilon, \qquad (4)$$

where ε is allowed to vary as a free parameter. In the second version, the
time decay of masking is assumed to be constant with a shift in frequency
equal to the shift in the MMF. This model is similar to the two-parameter
model with the exception that the shift in the MMF was estimated by allowing
λ to vary with the constraint ε = G - H(λ). The variable-λ model's
predictions are given by the relation

$$F(g) = H(g + \lambda) + [G - H(\lambda)]. \qquad (5)$$
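The three reduced models can be written out as a short sketch, with λ
denoting the frequency shift and ε the constant decay term. Everything
numerical here is hypothetical: the filter function H is a piecewise-linear
curve through made-up points (its actual form is given elsewhere in the
paper and is not reproduced here), the on-frequency forward-masked threshold
G is an arbitrary value, and the grid search is just one simple way to
estimate λ by least squares, not necessarily the regression procedure used
in the study.

```python
from bisect import bisect_right

# Hypothetical simultaneous-masking filter function H(g), in dB SPL, defined by
# linear interpolation through made-up points (not data from the study).
G_POINTS = [-0.3, -0.2, -0.05, 0.0, 0.05, 0.1, 0.2]
H_POINTS = [35.0, 48.0, 66.0, 72.0, 67.0, 58.0, 40.0]


def H(g):
    """Piecewise-linear H(g); values outside the sampled range are clamped."""
    if g <= G_POINTS[0]:
        return H_POINTS[0]
    if g >= G_POINTS[-1]:
        return H_POINTS[-1]
    i = bisect_right(G_POINTS, g)
    g0, g1 = G_POINTS[i - 1], G_POINTS[i]
    h0, h1 = H_POINTS[i - 1], H_POINTS[i]
    return h0 + (h1 - h0) * (g - g0) / (g1 - g0)


def fixed_parameter(g, G_on):
    """Eq. (3): lambda = 0 and epsilon = G - H(0); no free parameters."""
    return H(g) + (G_on - H(0.0))


def variable_epsilon(g, eps):
    """Eq. (4): lambda = 0; the constant decay term eps is free."""
    return H(g) + eps


def variable_lambda(g, G_on, lam):
    """Eq. (5): the shift lam is free; epsilon is constrained to G - H(lam)."""
    return H(g + lam) + (G_on - H(lam))


def fit_lambda(gs, thresholds, G_on):
    """Grid-search least-squares estimate of lam in steps of 0.001."""
    candidates = [k / 1000 for k in range(-100, 101)]

    def sse(lam):
        return sum((variable_lambda(g, G_on, lam) - t) ** 2
                   for g, t in zip(gs, thresholds))

    return min(candidates, key=sse)


# Synthetic check: thresholds generated with a known shift are recovered.
gs = [-0.2, -0.05, 0.0, 0.05, 0.1, 0.2]
G_on = 50.0
thresholds = [variable_lambda(g, G_on, 0.04) for g in gs]
estimate = fit_lambda(gs, thresholds, G_on)
print("estimated shift:", estimate)
```

Note that Eq. (5) returns G at g = 0 for any value of λ, so the constraint
ties the predicted function to the measured on-frequency forward-masked
threshold, and that setting λ = 0 in Eq. (5) gives back Eq. (3).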

The results of the regressions for all three models are shown in Table V.
The proportion of variance accounted for by four-parameter fits to the data
is included for comparison. The fixed-parameter model clearly does a poor
job of summarizing the data. On average, the proportion of variance
accounted for is over 18% less than when all four parameters are allowed to
vary. The variable-ε model does only slightly better. On average, the
proportion of variance it accounts for is 14% less than when all four
parameters are allowed to vary. Only the variable-λ model provides a
comparatively good fit to the data. It misses on average less than 3% of the
total variability accounted for by the four-parameter model. Figure 8
summarizes, for all conditions of this study, the forward-masked thresholds
and the corresponding predictions of the variable-λ model. The dashed lines
correspond to points of equality between the obtained and

TABLE V. Parameters for forward-masking filter functions resulting from
three assumed models (see text for a full explanation). The regression in
each case was performed on the mean data. Bottom rows give the proportion of
variance accounted for by each model and the corresponding rms error at each
signal frequency. The regression results for the four-parameter fits are
included for comparison. The difference between the a values obtained in
simultaneous and forward masking for the four-parameter fit provides an
independent estimate of the frequency shift λ.

                       Masker    Fixed-parameter   Variable-ε        Variable-λ        Four-parameter
fs (Hz)  t (ms)  level (dB SPL)  λ       ε         λ       ε         λ       ε         λ = a_s - a_f
500      5       80              0.000   -10.1     0.000   -8.93     0.045   -10.1      0.045
                 70              0.000   -7.60     0.000   -6.10     0.005   -7.60      0.040
                 60              0.000   -5.10     0.000   -3.68     0.037   -5.10      0.059
                 50              0.000   -2.60     0.000   -1.61     0.042   -2.60     -0.004
1000     5       80              0.000   -17.4     0.000   -15.7     0.022   -17.4      0.019
2000     5       80              0.000   -19.6     0.000   -18.6     0.052   -19.6      0.056
                 70              0.000   -15.6     0.000   -14.8     0.055   -15.6      0.062
                 60              0.000   -11.6     0.000   -9.75     0.065   -11.6      0.069
                 50              0.000   -7.60     0.000   -7.73     0.055   -7.60      0.062
         10      80              0.000   -22.1     0.000   -24.0     0.037   -22.1      0.030
                 70              0.000   -17.4     0.000   -17.6     0.045   -17.4      0.030
                 60              0.000   -12.7     0.000   -14.1     0.050   -12.7      0.035
         20      80              0.000   -26.0     0.000   -26.3     0.007   -26.9      0.057
                 70              0.000   -21.7     0.000   -20.3     0.065   -21.7      0.066
                 60              0.000   -16.5     0.000   -15.0     0.067   -16.5      0.049
         40      80              0.000   -34.7     0.000   -32.4     0.007   -34.7
                 70              0.000   -27.7     0.000   -26.4     0.025   -27.7
                 60              0.000   [illegible]  [illegible]  [illegible]  0.060   -20.7

[The bottom rows, giving the proportion of variance accounted for and the
rms error for each model at each signal frequency, are only partially
legible in the scan and are not reproduced here.]

FIG. 8. Predictions of the variable-λ model. Forward-masked thresholds are
shown for all conditions of this study. The dashed lines correspond to
points of equality between the obtained and the predicted thresholds. They
have been displaced from one another by 10 dB to allow comparisons at each
masker level (parameter). The different symbols denote the different signal
delays; masker frequencies appear as repeated symbols for each masker level
and signal delay combination. For the 5-ms delay, the signal frequencies
were 500 (crosses), 1000 (inverted triangles), and 2000 (squares) Hz. In all
other cases, the signal frequency is 2000 Hz. Abscissa: predicted threshold
(dB SPL).

the predicted thresholds. They have been displaced from one another by 10 dB
to allow comparisons at each masker level. The different symbols denote
signal delay; masker frequencies and signal frequencies appear as repeated
symbols for each masker level-signal delay combination. The correspondence
between the obtained and predicted thresholds is good. Overall the
one-parameter, variable-λ model accounts, respectively, for 97%, 97%, and
91% of the variance in the forward-masked thresholds for the 0.5-, 1.0-, and
2.0-kHz signals. The corresponding rms error is 1.3, 0.4, and 2.0 dB.

III.  DISCUSSION 

A. The interaction between masker level and signal delay

It has long been known that the time decay of masking is related to the
overall level of the masker (Gardner, 1947; Harris et al., 1951; Plomp,
1964; Zwislocki et al., 1959). Perhaps the most comprehensive data on this
relation are those of Jesteadt et al. (1982). These authors have shown that
the family of functions relating signal threshold to signal delay at each
masker level converge at a common delay. The rate of masking decay is
greater at high masker levels than at low. Widin and Viemeister (1979r)
interpret the level-delay interaction to reflect the dependence of masking
decay on the overall level of the masker. However, an alternative
interpretation is that the interaction reflects the dependence of masking
decay on the initial amount of masking (Moore and Glasberg, 1981). In the
earlier studies, as well as the study by Jesteadt et al., the overall level
of the masker is correlated with the initial amount of masking at some short
delay.

The data of the present study bear on this issue. For the filter function,
the initial amount of masking (simultaneous masking) decreases with
increasing frequency separation between signal and masker, but the overall
level of the masker remains constant. Thus, if masking decay depends on the
initial amount of masking, simultaneous- and forward-masking filter
functions should be nonparallel; they should be parallel if masking decay
depends on the overall level of the masker. The present data tend to support
the conclusion of Widin and Viemeister that masking decay depends on the
overall level of the masker. Based on their own data, however, Moore and
Glasberg (1981) came to the opposite conclusion. They show filter functions
(their Fig. 8), obtained using a notched-noise masker, which are slightly
sharper in forward masking.

FIG. 9. Left panel: simultaneous- (circles) and forward- (triangles) masked
thresholds from the study of Moore and Glasberg (1981), subject IB. Signal
duration is 5 ms. Right panel: simultaneous- and forward-masked thresholds
from the study of Moore et al. (1987). In each panel, the dashed line gives
the prediction of the variable-λ model for λ equal to 0.

Part of the reason for this apparent discrepancy has to do with the way in
which Moore and Glasberg present their data. The forward-masked thresholds
shown in their Fig. 8 are not the actual masked thresholds, but rather a
transformation of these thresholds based on a broadband noise masking
condition. We have performed the inverse transformation to allow comparison
between the simultaneous- and forward-masked thresholds actually obtained.
The data shown in Fig. 9 are from a representative subject (IB). The dashed
line gives the prediction of Eq. (5) with λ equal to 0; in this case, H is
equivalent to the curve drawn through the simultaneous-masked thresholds.
Quiet thresholds were not reported for these subjects; therefore, the
horizontal portion of the predicted curve corresponds to an estimate of
quiet threshold. The obtained filter function in forward masking does appear
to be shallower than that predicted by Eq. (5). The discrepancy is so small,
however, that it is not possible to decide based on these data alone whether
the decay of masking ultimately depends on overall masker level or on the
initial amount of masking. A stronger test requires measuring the
forward-masking filter functions at longer signal delays. Moore and Glasberg
(1981) report a progressive broadening of forward-masking filter functions
with signal duration that is related to signal delay. Unfortunately, Eq. (5)
cannot be applied to these data as the corresponding simultaneous-masking
filter functions are missing. Equation (5) can, however, be applied to
recent data of Moore et al. (1987) where simultaneous- and forward-masking
filter functions are reported for a longer duration signal. The data from a
representative subject (FL) and the prediction of Eq. (5) are shown in the
second panel of Fig. 9. The filter functions, which were obtained using a
20-ms signal, are shallower than those shown in the first panel of this
figure where the signal duration is 5 ms. This result is consistent with the
data of Moore and Glasberg. Once again, the deviation from the predicted
curve is quite small.

Although the present data might appear to support the conclusion that masker
level, not initial amount of masking, is the critical variable in
determining the rate of masking decay, it is important to note that the two
interpretations need not be mutually exclusive. Small deviations from
parallel filter functions may indicate a weak relation to initial amount of
masking, although a larger proportion of the variance may be accounted for
by the relation to masker level. However one chooses to interpret the
interaction, it is clear from the present as well as the past studies that
the decay of masking is generally greater at higher masker levels.

B. Implications for interpreting measures of frequency selectivity

It is reasonable to suspect that certain procedures may yield differences
between simultaneous and forward masking that have little to do with any
"real" difference in auditory frequency selectivity. This concern was
intimated early on (Widin and Viemeister, 1979a; Wightman et al., 1977). The
present study underscores this concern inasmuch as the disparity between
simultaneous- and forward-masking tuning curves is attributed to an
interaction between masker level and signal delay that is peculiar to the
tuning curve procedure. It is difficult to determine to what extent the
level-delay interaction might have affected differences among tuning curves
observed in past studies. These studies have not, in general, obtained all
the measures within a study necessary to evaluate the level-delay
interaction independently of observed frequency effects. Special care is
required, therefore, in interpreting the previously observed differences
between simultaneous and forward masking. This conclusion pertains as well
to the broadening of tuning curves observed at longer signal delays (Kidd
and Feth, 1981; Nelson and Freyman, 1984). It is known, for instance, that
filter functions begin to broaden at high masker levels (Patterson, 1971;
Lutfi and Patterson, 1984; Weber, 1977). Since high-level maskers are
typically required to mask signals at long delays, one may expect that
tuning curves would also broaden at long signal delays. As Nelson and
Freyman (1984) showed, such effects can be compensated for by equating
masker levels near the tips of the tuning curves. Of course, such problems
can be avoided by fixing masker level and varying signal level to threshold,
as in the filter function procedure. This procedure has the advantage of
allowing an estimate of the shape of the auditory filter at each masker
level (e.g., Lutfi and Patterson, 1984).

A related measure in which masker level is fixed is the masking pattern. The
masking pattern differs from the filter function in that signal frequency
rather than masker frequency is plotted along the abscissa; masker frequency
is fixed. For the masking pattern, the level-delay interaction must be
assessed by measuring the on-frequency forward-masking function G at each
signal frequency. Although the effects are small, the on-frequency
forward-masking function can vary with signal frequency (e.g., Jesteadt et
al., 1982). Thus, according to Eq. (5), masking patterns need not be
parallel with signal delay, unlike the filter functions (cf. Lutfi, 1985;
Moore, 1985). Kidd and Feth (1982) report masking patterns as a function of
signal delay for a 1.0-kHz, sinusoidal masker, and, indeed, the
high-frequency branches of their masking patterns become more shallow with
increasing delay. Unfortunately, Kidd and Feth do not present the
corresponding on-frequency masking functions that would be required to apply
the predictions of Eq. (5) to their data.

C.  Additivity  and  the  MMF 

To predict the forward-masked thresholds of the present study, we required
the estimation of the parameter λ equal to the shift in the MMF from
simultaneous to forward masking. Consider, however, the special case in
which λ equals 0 (the fixed-parameter model). Then, Eq. (5) may be rewritten
in the form

$$F(g) = G + W(g), \qquad (6)$$

where W(g) = H(g) - H(0) is the dB attenuation characteristic of the
auditory filter (see Patterson, 1976). Equation (6) suggests a fundamental
property of masking. That is, if we identify G with the time decay of
masking and W with the frequency selectivity of the system, then the
implication of Eq. (6) is that these two processes are independent and
additive (additive, that is, in dB). Of course, as previously shown, the
present data are poorly summarized when λ is assumed to be 0. Therefore, the
relation implied by Eq. (6) is either false or some third unrelated factor
is responsible for the shift in the MMF in forward masking.
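The additivity implied by Eq. (6) can be checked numerically. In the sketch
below, H is a hypothetical, asymmetric, triangular filter function in dB,
and the value of G and the shift of 0.05 are arbitrary; none of these
numbers come from the study. With λ = 0 the difference F(g) - H(g) is the
same at every masker frequency, so the two filter functions are parallel in
dB. With a nonzero λ the difference varies with g and the peak of F moves
below the signal frequency.

```python
def H(g):
    """Hypothetical simultaneous-masking filter function in dB SPL (not from the study)."""
    slope = 120.0 if g < 0 else 200.0  # dB per unit of relative masker frequency
    return 70.0 - slope * abs(g)


def W(g):
    """dB attenuation characteristic of the filter, W(g) = H(g) - H(0)."""
    return H(g) - H(0.0)


def F(g, G_on, lam=0.0):
    """Eq. (5); with lam = 0 it reduces to Eq. (6), F(g) = G + W(g)."""
    return H(g + lam) + (G_on - H(lam))


gs = [k / 100 for k in range(-20, 21, 5)]  # g from -0.20 to 0.20 in steps of 0.05
G_on = 55.0  # hypothetical on-frequency forward-masked threshold, dB SPL

# Decay of masking, F(g) - H(g), at each masker frequency.
decay_fixed = [F(g, G_on) - H(g) for g in gs]
decay_shift = [F(g, G_on, lam=0.05) - H(g) for g in gs]

# Relative masker frequency producing maximum forward masking.
mmf_fixed = max(gs, key=lambda g: F(g, G_on))
mmf_shift = max(gs, key=lambda g: F(g, G_on, lam=0.05))

print("spread of decay, lam = 0:   ", max(decay_fixed) - min(decay_fixed))
print("spread of decay, lam = 0.05:", max(decay_shift) - min(decay_shift))
print("MMF at g =", mmf_fixed, "and g =", mmf_shift)
```

In this toy case the fixed-parameter functions differ by the constant
G - H(0) at every g, whereas the shifted function peaks at g = -λ.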

McFadden (1986) has recently reviewed arguments favoring the latter
possibility. The most compelling interpretation attributes the shift in the
MMF to a basalward displacement of the traveling-wave envelope with
increasing stimulus intensity. According to McFadden, when stimulus
intensity is high, the region maximally "fatigued" along the cochlear
partition occurs basal to the region maximally fatigued when stimulus
intensity is low. Thus, at high intensities, signals at frequencies above
the masker would be expected to undergo the greatest amount of masking.
McFadden cites physiological evidence for a basalward displacement with
intensity which lends credence to this interpretation (e.g., Evans, 1977;
Rhode, 1978; Russell and Sellick, 1978; Sachs and Abbas, 1974). He further
argues that the greatest effects should be observed in forward masking, as
in the present study, where certain confounding interactions between signal
and masker are eliminated. Other interpretations have been proposed by
Vogten (1978b) and Widin and Viemeister (1979b). Vogten describes the shift
in the MMF observed at low stimulus intensities in terms of suppression,
while Widin and Viemeister emphasize the importance in forward masking of
the short-term spectrum at the offset of the masker. Of all of these
interpretations, only Vogten's is clearly inconsistent with the additive
relation implied by Eq. (6).


0.  Individual  differences  in  tuning  curves 

The decision to average the data of the four subjects in
each condition of this study was based on the similarity
among the individual filter functions in each condition. In
other words, the thresholds overall did not differ
significantly across subjects. When the data were replotted as tuning
curves (constant threshold), however, differences among
subjects were greatly exaggerated, particularly on the
high-frequency side of the tuning curve, and particularly in
forward masking. The reason for this is clear. Consider the
slope values of the masking functions in Table III. The slope
values are generally quite small for forward maskers with
frequencies above that of the signal (g>0). For instance,
when g = 0.2, the slope of the masking function for the
2.0-kHz signal presented at the 5-ms delay is only 0.16. This
means that a 1.6-dB difference in threshold in this condition
corresponds to a 10-dB difference in masker level for the
tuning curve. Quite frequently, such small differences in
threshold can be expected to produce rather large
differences in psychophysical tuning curves. Consequently,
special care may be required in interpreting individual
differences among tuning curves. This is particularly true in
studies where tuning curves are reported for only one or two
subjects, or where comparisons are made between the tuning
curves of individual normal-hearing and hearing-impaired
subjects.
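
The arithmetic behind this caution can be written out explicitly. The sketch below assumes that the masking function is locally linear, so that a threshold difference maps onto a masker-level difference of (threshold difference)/(slope) on the tuning curve. The function name and the comparison slope of 1.0 are illustrative and are not taken from the study.

```python
def masker_level_difference(threshold_diff_db, slope):
    """Masker-level change (dB) that offsets a given threshold change (dB).

    Assumes a locally linear masking function whose slope is expressed in
    dB of signal threshold per dB of masker level.
    """
    if slope <= 0:
        raise ValueError("slope must be positive")
    return threshold_diff_db / slope


# Slope of 0.16 quoted in the text (g = 0.2, 2.0-kHz signal, 5-ms delay).
shallow = masker_level_difference(1.6, 0.16)
# Hypothetical comparison: the same threshold difference with a slope of 1.0.
steep = masker_level_difference(1.6, 1.0)
print(round(shallow, 2), round(steep, 2))
```

With the shallow slope, a threshold difference of under 2 dB is magnified roughly sixfold when expressed as a difference between tuning curves, which is the point made above.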


IV.  SUMMARY 

The major results of this study can be summarized as
follows: (1) Simultaneous- and forward-masked thresholds
are described well by parallel filter functions; (2) maximum
forward masking typically occurs at a frequency below that
of maximum simultaneous masking; (3) except, perhaps, at
low signal frequencies, forward-masking tuning curves are
narrower than simultaneous-masking tuning curves, even at
high signal levels; (4) differences among forward-masking
tuning curves are largely eliminated when signal-level,
signal-delay combinations are chosen to equate masker levels
near the tips of the tuning curves; (5) a single
frequency-selective function estimated exclusively from the
simultaneous-masked thresholds can be used to predict results
(1)-(4). The results imply that a single frequency-selective
process can account for commonly observed differences
between simultaneous- and forward-masked measures of
frequency selectivity.

¹The function H is actually proportional to the integral of the auditory filter
characteristic evaluated over the frequency band of the masker. However,
because the frequency band of the masker is relatively small, direct
comparisons between the bandwidths of H and the filter characteristic are
permissible.

ABSTRACT

The  present  study  was  conducted  to  determine  to  what  extent  the  combined 
effect  of  two  forward  maskers  can  be  predicted  from  addition  of  their  individual 
effects.  The  maskers  were  50-Hz  wide  noise  bands  with  center  frequencies  ranging 
from  1.8  to  2.2  kHz.  The  signal  was  a  brief,  2.0-kHz  tone  burst.  When  the 
maskers  were  gated  on  and  off  together,  the  combination  produced  sometimes  more 
and  sometimes  less  masking  than  predicted  depending  on  the  particular  pair  and  the 
relative amounts of masking produced by the individual members of the pair. The
greatest discrepancy occurred, however, when the masker pair was presented
simultaneously  with  the  signal  or  when  the  forward  maskers  were  presented  in 
sequence.  In  the  latter  case,  the  obtained  threshold  exceeded  the  predicted  threshold 
by  as  much  as  34  dB. 

Keywords:  Forward  masking,  additivity 


INTRODUCTION 

Over  the  last  several  years,  there  has  been  renewed  interest  in  masking  by 
pairs  of  ’equated’  maskers;  maskers  which  when  presented  separately  produce 
equal  amounts  of  masking  of  a  common  signal.  Early  studies  revealed  that  the 
masking  produced  by  such  pairs  often  exceeds  the  simple  sum  of  the 
masking  produced  by  the  individual  members  of  the  pair  (Bilger,  1959;  Green, 
1967; Pollack, 1964). In the study by Pollack, this excess masking (beyond that
predicted by simple summation) amounted to as much as 19 dB. The more recent
studies reveal the excess masking effect to be widespread. Large amounts of
excess  masking  have  now  been  obtained  for  various  pairs  of  sequential 
forward  maskers  (Hanna  et  al.,  1982;  Penner,  1980;  Penner  and  Shiffrin,  1980; 
Widin  and  Viemeister,  1980),  forward  and  simultaneous  maskers  (Jesteadt  et  al., 
1982;  Jesteadt  and  Wilke,  1982),  forward  and  backward  maskers  (Patterson, 
1971;  Penner,  1980;  Robinson  and  Pollack,  1973;  Wilson  and  Carhart,  1971), 
and  pairs  of  simultaneous  maskers  (Canahl,  1971;  Lutfi,  1983;  Moore,  1985; 
Nelson, 1979; Patterson and Nimmo-Smith, 1980; Zwicker, 1954). Indeed, only pairs
of concurrent forward maskers have so far been found not to produce any excess
masking (Jesteadt and Wilke, 1982; Neff and Jesteadt, 1983). Such results
challenge the traditional view of masking which assumes that the effects of
maskers are additive (Fletcher, 1940; Patterson and Green, 1978). They have led
to a different class of models in which the effects of maskers undergo a
compressive nonlinear transformation prior to summation (Penner, 1980; Penner
and Shiffrin, 1980; Lutfi, 1983, 1985).
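
The contrast between the two classes of model can be illustrated numerically. The sketch below assumes a power-law form for the compressive transformation; it is offered only as an illustration and is not the specific formulation of any of the cited models. The function name, the 40-dB thresholds, and the exponent p = 0.3 are hypothetical values chosen for the example.

```python
import math


def combined_threshold_db(thresholds_db, p=1.0):
    """Predicted masked threshold (dB) for maskers presented together.

    Each individual masked threshold is converted to intensity, raised to
    the power p, summed, and transformed back. p = 1 corresponds to simple
    intensity summation; p < 1 is a compressive transform applied before
    summation (assumed power-law form).
    """
    total = sum((10.0 ** (t / 10.0)) ** p for t in thresholds_db)
    return 10.0 * math.log10(total ** (1.0 / p))


equal_pair = [40.0, 40.0]  # hypothetical thresholds for two equated maskers
additive = combined_threshold_db(equal_pair, p=1.0)
compressive = combined_threshold_db(equal_pair, p=0.3)  # illustrative exponent

# Increase over either masker alone under each assumption.
print(round(additive - 40.0, 2))
print(round(compressive - 40.0, 2))
```

Under simple summation the increase for two equated maskers is bounded at about 3 dB, whereas any exponent below 1 yields a larger increase, which is how models of this kind accommodate excess masking.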

A  strong  assumption  of  these  nonlinear  models  is  that  the  amount  of  excess 
masking  depends  only  on  the  relative  effects  of  maskers  in  the  pair,  not  on 
their  particular  temporal  or  spectral  configuration  (1).  The  assumption  reflects 
the early stage of development of these models. For instance, a basic difference
in excess masking does appear to exist for simultaneous versus nonsimultaneous
masker pairs (cf. Penner, 1980; Lutfi, 1983). There is also the issue as to
why all combinations of equated maskers appear to produce excess masking except
concurrent forward maskers. This latter issue seems the more critical since it would
be unparsimonious to assert that auditory transduction is nonlinear except in this
one  case. 

In  light  of  the  results  for  concurrent  forward  maskers,  it  seemed  appropriate 
to  explore  further  the  effects  of  these  maskers.  We  began  by  focusing  on 
combinations  of  maskers  with  frequencies  disparate  from  the  signal  frequency.  In 
the  studies  by  Jesteadt,  Neff  and  Wilke,  the  effective  masker  components  were 
always  at  or  very  nearly  equal  to  the  signal  frequency.  We  also  examined  the 
effects  of  varying  the  relative  amounts  of  masking  produced  by  the  individual 
maskers  in  the  pair.  The  pattern  of  results  to  emerge  from  these  experiments 
was  complex.  Various  maskers  combined  to  produce  significant  amounts  of 
excess  masking,  no  excess  masking,  or  even  a  release  from  masking.  We  next 
examined  the  masking  produced  by  a  pair  of  simultaneous  maskers,  a  pair  of 
sequential  forward  maskers,  and  a  simultaneous-plus-forward  masker.  The  first  two 
pairs  produced  the  largest  amounts  of  excess  masking  observed  in  this  study,  as  much 
as  34  dB  for  the  pair  of  sequential  forward  maskers.  In  contrast,  the  effects  of  the 
simultaneous  and  forward  masker  when  paired  were  additive.  The  data  are  enough 
to  dishearten  those  who  would  propose  such  elegant  models  as  offered  by 
Penner  (1980),  Penner  and  Shiffrin  (1980),  and  Lutfi  (1983). 

I.  METHOD 
A.  Stimuli 

The  signal  in  all  conditions  was  a  10-ms,  2.0-kHz  sinusoid,  gated  on  and 
off  with  5-ms  Kaiser  ramps.  This  ramp  produces  spectral  sidelobes  that  are  more 
than  70  dB  down  from  the  primary  lobe  within  20%  of  the  primary  lobe  center 
frequency (see Childers and Durling, 1975). The maskers were 200-ms, 50-Hz wide
noise bands with variable center frequencies; they were also gated on and off
with 5-ms Kaiser ramps. The long-term power spectra of the noise bands had
skirts  that  fell  over  1000  dB/octave  near  the  passband.  Three  pairs  of  maskers 
were  used.  The  center  frequencies  of  the  maskers  in  each  pair  were  1.8  and  1.9 
kHz,  1.9  and  2.1  kHz,  and  2.1  and  2.2  kHz.  The  maskers  were  gated  on  and  off 
together  with  the  onset  of  the  signal  following  after  a  5-ms  silent  interval. 
All  stimuli  were  generated  digitally  and  output  through  14-bit  DACs.  Each  masker 
was  randomly  sampled  from  a  3-s  file  on  each  presentation.  The  signal  and 
each  masker  in  the  pair  were  played  over  separate  DACs  (10-kHz  rate  for  each 
DAC).  When  only  one  masker  was  presented,  0s  (corresponding  to  0  voltage) 
were  output  through  the  DAC  otherwise  occupied  by  the  second  masker.  The 
output  of  each  DAC  was  low-pass  filtered  at  4.0  kHz,  120  dB/octave  for  each 
masker  and  96  dB/octave  for  the  signal.  After  mixing,  the  stimuli  were 
amplified  and  were  presented  over  TDH-49  headphones  to  the  right  ear  of 
subjects seated in a double-walled, IAC sound-attenuated chamber.
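
For readers who wish to approximate the signal, a minimal sketch of a Kaiser-gated tone burst follows. Several details are assumptions and are not specified above: the Kaiser shape parameter (beta = 10.0), the treatment of the 10-ms duration as including both 5-ms ramps, and the use of the rising half of a Kaiser window as the ramp. All function names are illustrative.

```python
import math


def bessel_i0(x, terms=30):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kaiser_ramp(n_samples, beta):
    """Rising half of a Kaiser window of length 2 * n_samples."""
    full_len = 2 * n_samples
    denom = bessel_i0(beta)
    ramp = []
    for n in range(n_samples):
        r = 2.0 * n / (full_len - 1) - 1.0  # runs from -1 toward 0
        ramp.append(bessel_i0(beta * math.sqrt(1.0 - r * r)) / denom)
    return ramp


def tone_burst(freq_hz=2000.0, dur_s=0.010, ramp_s=0.005,
               rate_hz=10000, beta=10.0):
    """Sinusoid gated on and off with Kaiser ramps (beta is an assumed value)."""
    n_total = round(dur_s * rate_hz)
    n_ramp = round(ramp_s * rate_hz)
    rise = kaiser_ramp(n_ramp, beta)
    envelope = rise + [1.0] * (n_total - 2 * n_ramp) + rise[::-1]
    return [envelope[i] * math.sin(2.0 * math.pi * freq_hz * i / rate_hz)
            for i in range(n_total)]


burst = tone_burst()
print(len(burst))
```

At the 10-kHz rate stated above, the burst is 100 samples long and, under the duration assumption made here, has no steady-state portion between the two ramps.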

B. Procedure

Signal thresholds were obtained using a standard, two-interval, forced-choice
adaptive procedure (Levitt, 1971). Each trial block began with the signal
about 15 dB above masked threshold. Signal level decreased after two consecutive
correct responses, and it increased for each incorrect response. The initial value
of the step size was 6 dB; after two reversals in the direction of attenuation, the
step size was changed to 2 dB. The trial sequence ended for each subject
individually after a total of 18 reversals in signal attenuation had been recorded.
The  first  two  reversals  were  ignored  and  the  levels  of  the  remaining 
reversals  were  averaged  to  obtain  a  threshold  estimate. 
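
The tracking rule just described can be sketched as follows. A two-down, one-up rule of this kind converges on the signal level yielding 70.7% correct (Levitt, 1971). The sketch is not the software used in the study: the function name and the callable standing in for the listener are hypothetical, and the step size is assumed to change at the moment the second reversal is recorded.

```python
def run_staircase(is_correct, start_level_db, n_reversals=18,
                  big_step=6.0, small_step=2.0, n_discard=2):
    """Two-down, one-up adaptive track; returns the threshold estimate (dB).

    is_correct(level_db) -> bool stands in for the listener's response
    on a single two-interval, forced-choice trial.
    """
    level = start_level_db
    step = big_step
    correct_run = 0
    last_direction = 0  # -1: last change was down, +1: up, 0: none yet
    reversals = []
    while len(reversals) < n_reversals:
        if is_correct(level):
            correct_run += 1
            if correct_run < 2:
                continue  # two consecutive correct responses are required
            correct_run = 0
            direction = -1
        else:
            correct_run = 0
            direction = +1
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)  # level at which the track turned
            if len(reversals) == 2:
                step = small_step
        last_direction = direction
        level += direction * step
    kept = reversals[n_discard:]  # the first two reversals are ignored
    return sum(kept) / len(kept)


# Idealised, noise-free listener (an assumption made only for illustration).
true_threshold_db = 30.0
estimate = run_staircase(lambda level: level >= true_threshold_db,
                         start_level_db=true_threshold_db + 15.0)
print(estimate)
```

A real listener responds probabilistically, so in practice the callable would be replaced by recorded responses or by a simulated psychometric function.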

The  relative  amount  of  masking  produced  by  the  individual  maskers  in  each 
pair  was  varied  by  varying  the  relative  levels  of  the  maskers  in  the  pair.  One 
masker  was  always  fixed  at  65  dB  SPL,  while  the  other  varied  from  30  to  80  dB 
SPL  in  10  dB  steps.  An  entire  masking  function  (signal  threshold  versus 
masker  level)  was  obtained  for  a  single  masker  configuration  before  proceeding 
to  the  next.  Typically,  three  masking  functions  were  obtained  within  each  daily 
2-h session. After a single threshold estimate had been obtained for all
conditions of the experiment, a replication was performed.

C.  Subjects 

Four  university  students  were  paid  at  an  hourly  rate  to  participate  as 
listeners  in  the  study.  The  ages  of  the  subjects  ranged  between  18  and  23  years. 
All reported normal hearing and all had at least 10 h of previous experience with
the  adaptive,  two-interval,  forced-choice  task. 

II.  RESULTS 

The pattern of results was the same for all subjects; therefore, the
threshold estimates for each condition were averaged across subjects and
replications. Fig. 1 gives the masking functions for all three masker pairs. In
each  panel,  unfilled  circles  represent  the  masking  function  for  the  variable-level 
masker  presented  alone  and  filled  triangles  represent  the  masking  function  for 
the  variable-level  masker  in  the  presence  of  the  fixed-level  masker  (vertical  bars 
represent  one  standard  error  on  either  side  of  the  mean).  The  masking  produced 
by the fixed-level masker alone is designated by the dashed line in each panel. The
masking  functions  for  the  single  variable-level  maskers  are  quite  typical  of  those 
obtained  in  the  past  (Egan  and  Hake,  1950;  Vogten,  1978).  They  have  a  slope  of  1 
or  slightly  greater  for  maskers  below  the  signal  frequency,  and  a  slope  slightly  less 
than  one  for  maskers  above  the  signal  frequency. 

Assuming  simple  summation  of  masking,  the  amount  of  masking  produced  by 
the  masker  pairs  should  never  exceed  by  more  than  3  dB  the  amount  of  masking 
produced  by  the  more  effective  member  of  the  pair.  Also,  masking  by  the  pair 
should  never  fall  below  that  of  the  more  effective  member.  Exceptions  to  both 
of  these  predictions  are  evident  in  Fig.  1.  For  example,  when  the  1.9  and  2.1-kHz 
maskers  separately  produce  equivalent  amounts  of  masking  (where  the  circles 
and  dashed  lines  intersect),  the  amount  of  masking  produced  by  the  pair  is 
about  10  dB  greater  than  either  masker  alone.  This  is  equivalent  to  7  dB  of  excess 
masking.  Excess  masking  is  evident  for  the  other  masker  pairs  as  well,  although 
the  amount  of  excess  masking  is  never  as  large.  For  all  masker  pairs,  there 
also  appears  to  be  a  release  from  masking;  the  pair  actually  produces  less  masking 
than  the  more  effective  member  of  the  pair.  The  release  from  masking  is  evident  in 
the  left-hand  portion  of  each  panel  where  the  masking  function  for  the  pair  dips 
below that produced by the fixed-level masker (dashed line). In most cases, the
release from masking amounts to only a few dB; however, in at least one case
(fixed 2.2-, variable 2.1-kHz masker) it amounts to over 10 dB.

The  overall  pattern  of  results  is  made  more  clear  in  Fig.  2.  The  dependent 
variable  in  Fig.  2  is  the  difference  between  the  amount  of  masking  produced  by 
the  pair  and  the  amount  of  masking  produced  by  the  more  effective  member  of 
the pair. This relative amount of masking is plotted as a function of the
difference  in  the  amount  of  masking  produced  by  each  masker  individually. 
The  unfilled  circles  are  from  the  condition  wherein  the  lower  frequency  masker  was 
fixed  in  level.  The  filled  squares  are  from  the  condition  wherein  the  higher  frequency 
masker  was  fixed  in  level.  The  solid  line  in  the  middle  of  each  panel  gives  the 
prediction  based  on  simple  additivity  of  masking.  From  Fig.  2  it  is  possible  to  see 
that  excess  masking  results  whenever  the  individual  maskers  in  the  pair  produce 
roughly  equivalent  amounts  of  masking.  The  effect  is  largest  for  the  1.9+2.1-kHz 
pair  and  is  small  or  nonexistent  for  the  1.8+1.9-kHz  pair.  In  contrast  to  the  excess 
masking,  a  release  from  masking  results  whenever  the  individual  maskers  in  the  pair 
produce largely discrepant amounts of masking; again, the largest release from
masking is obtained with the 2.1+2.2-kHz pair.
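
The quantities plotted in Fig. 2 and the additive prediction can be computed as in the sketch below. It assumes that masked thresholds lie well above the quiet threshold, so that amounts of masking expressed in dB can be combined by power summation. The function names and the 20-dB and 30-dB example values are hypothetical.

```python
import math


def additive_gain_db(masking_a_db, masking_b_db):
    """Additive prediction for the ordinate of Fig. 2.

    Returns the predicted masking by the pair minus the masking by the
    more effective member, in dB.
    """
    diff = abs(masking_a_db - masking_b_db)
    return 10.0 * math.log10(1.0 + 10.0 ** (-diff / 10.0))


def excess_masking_db(masking_a_db, masking_b_db, masking_pair_db):
    """Obtained masking by the pair minus the additive prediction.

    Negative values mean the pair masked less than simple summation predicts.
    """
    predicted = (max(masking_a_db, masking_b_db)
                 + additive_gain_db(masking_a_db, masking_b_db))
    return masking_pair_db - predicted


# Two equated maskers (hypothetical 20 dB each) whose combination produces
# 10 dB more masking than either alone, as described for the 1.9+2.1-kHz pair.
print(round(additive_gain_db(20.0, 20.0), 2))
print(round(excess_masking_db(20.0, 20.0, 30.0), 2))
```

The predicted gain is largest, about 3 dB, when the maskers are equated and shrinks toward zero as the difference between them grows, so a 10-dB increase for an equated pair corresponds to about 7 dB of excess masking.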

Fig.  3  shows  a  similar  set  of  data  from  a  control  condition  in  simultaneous 
masking. In this condition, the maskers were identical to the 1.8- and 1.9-kHz
maskers used before; however, the signal was a 2.0-kHz sinusoid which was gated
on and off with the maskers using 5-ms Kaiser ramps as before. These data are
consistent  with  the  data  from  numerous  other  studies  which  have  obtained  10  dB  or 
more  excess  masking  for  the  combination  of  two  simultaneous  maskers  (Canahl,  1971; 
Green, 1967; Lutfi, 1983; Nelson, 1979; Patterson and Nimmo-Smith, 1980).

III.  DISCUSSION 

A.  Is  forward  masking  additive? 

The  studies  of  Jesteadt  and  Wilke  (1982)  and  Neff  and  Jesteadt  (1983)  have 
suggested  that  the  effects  of  concurrent  forward  maskers  are  additive.  This  is  an 
important  result  because  many  other  combinations  of  maskers  have  so  far  been  shown 
to  produce  large  nonadditive  effects  in  the  form  of  excess  masking.  The  present  data 
show  that  some  combinations  of  concurrent  forward  maskers  can  produce  nonadditive 
effects,  both  in  the  form  of  excess  masking  and  as  a  release  from  masking.  In  this 
section,  we  consider  possible  explanations  for  the  apparent  discrepancy  between  the 
present  and  past  results. 

Consider  first  the  excess  masking.  In  the  present  study,  significant  amounts  of 
excess forward masking were obtained only for the 1.9+2.1 and 2.1+2.2-kHz masker
pairs,  and  then  only  when  the  individual  maskers  in  the  pair  produced  nearly 
equivalent  amounts  of  masking.  We  believe  that  a  different  factor  is  responsible  for 
the excess masking in each case. For the 1.9+2.1-kHz pair, a likely cause is
’off-frequency listening’ (Patterson and Nimmo-Smith, 1980). Because the ear’s frequency
resolving power is finite, there is some spread of excitation across auditory frequency
channels.  Consequently,  the  channels  providing  the  best  signal-to-noise  ratio  may 
sometimes  be  located  off  the  frequency  of  the  signal,  away  from  that  of  the  masker. 
This  form  of  off-frequency  listening  is  restricted  whenever  the  frequencies  of  the 
maskers  bracket  the  signal,  as  is  true  for  the  1.9+2.1-kHz  pair.  The  result  is  that 
the  effect  of  such  maskers  in  combination  may  exceed  the  simple  sum  of  their  effects 
in  isolation.  A  different  interpretation  is  required  for  the  2.1+2.2-kHz  pair  since 
these  maskers  do  not  bracket  the  signal.  In  this  case,  the  excess  masking  could  have 
resulted from masking by an aural combination band generated at the signal
frequency. This combination band would have been expected to produce significant
amounts of masking only when the level of the 2.2-kHz masker was equal to or
slightly below the level of the 2.1-kHz masker (Goldstein, 1967; Greenwood, 1971). As
it  happens,  these  are  exactly  the  circumstances  under  which  the  excess  masking  for 
this  pair  is  observed. 

Consider  next  the  release  from  masking.  At  first,  one  might  be  inclined  to 
attribute  this  effect  to  suppression  of  the  less  intense  masker  by  the  more  intense 
masker  (Houtgast,  1974).  However,  there  are  two  reasons  why  this  interpretation  is 
inadequate.  First,  for  at  least  two  of  the  masker  pairs,  the  masker  frequencies  seem 
too close to yield measurable suppression effects (see Shannon, 1976). Second, even
if  the  less  intense  masker  were  completely  suppressed,  we  should  not  expect  the 
amount  of  masking  to  be  any  less  than  that  of  the  more  intense  masker.  In  other 
words,  there  should  be  no  release  from  masking  except  perhaps  in  the  very  rare 
instance  when  the  less  intense  masker  produced  the  greater  amount  of  masking.  A 
more  likely  interpretation  is  that  the  release  from  masking  reflects  the  use  of  a 
’quality difference cue’ between the signal and the masker pair (Moore, 1980; Fastl
and  Bechly,  1981).  Moore  and  others  have  presented  evidence  that  a  brief  signal 
may  be  confused  as  part  of  the  forward  masker  when,  as  in  the  present  experiment, 
the  signal  and  masker  share  a  similar  ’pitch-like’  quality.  Adding  a  second  masker, 
which itself produces relatively little masking, gives the overall masker a more
’noise-like’ quality, thereby lessening the chance of such confusions.

If  one  accepts  the  possibility  of  such  confounding  influences,  then  for  only  one 
condition  of  the  present  study  can  the  results  be  safely  compared  to  those  of  Jesteadt 
and  Wilke  (1982),  and  Neff  and  Jesteadt  (1983).  This  would  be  the  1.8+1.9-kHz 
condition  in  which  the  individual  maskers  produce  nearly  equivalent  amounts  of 
masking.  For  this  condition,  the  effects  of  the  maskers  do  appear  to  be  additive;  if 
not,  the  discrepancy  is  very  small.  Thus,  the  data  for  this  condition,  at  least, 
appear  to  be  consistent  with  the  previous  data  using  concurrent  forward  maskers. 

B.  Excess  masking  as  a  failure  of  waveshape  analysis. 

Barring  any  confounding  interactions  between  maskers,  why  is  it  that  only  the 
effects  of  concurrent  forward  maskers  appear  to  be  additive?  One  explanation  may 
be  made  in  terms  of  waveshape  analysis.  First,  consider  what  happens  when  two 
nonconcurrent  forward  maskers  are  combined.  When  either  masker  is  presented  alone, 
the signal will be detected as a brief perturbation in waveshape at the end of the
masker.  Adding  a  second  masker,  which  is  separated  from  the  first  in  time,  will 
produce  a  second  perturbation  in  close  temporal  proximity  to  that  produced  by  the 
signal.  This  second  perturbation  may  ’mask’,  be  confused  with,  or  otherwise  interfere 
with  that  produced  by  the  signal  which  in  turn  may  result  in  excess  masking.  The 
situation  is  different  for  concurrent  forward  maskers.  Gating  the  forward  maskers  on 
and  off  together  simply  eliminates  this  second  perturbation,  and  thus,  the  proposed 
cause  of  the  excess  masking.  It  is  of  interest  to  note  that  a  similar  type  of 
waveshape  analysis  has  been  suggested  as  the  cause  of  excess  simultaneous  masking 
(Lutfi,  1986). 

The  foregoing  analysis  is  easily  tested.  If  waveshape  analysis  is  responsible  for 
the  excess  masking  obtained  with  nonconcurrent  maskers,  then  it  should  be  possible 
to  both  minimize  and  maximize  the  excess  masking  by  selecting  masker  combinations 
which  respectively  minimize  and  maximize  the  difficulty  of  waveshape  analysis.  An 
additional  experiment  was  conducted.  The  signal  was  a  20-ms,  2.0-kHz  sinusoidal 
burst. The nonconcurrent masker pair was a 200-ms noise band immediately followed
by a 20-ms noise band (i.e., the offset of the first noise band corresponded to the
onset of the second noise band at the zero voltage points). Both noise bands were
50 Hz wide and both were centered at 2.0 kHz. All stimuli were gated on and off
as before with 5-ms Kaiser ramps. The levels of the maskers were also selected so
that each individually produced the same amount of masking. The first panel of
Fig. 4 shows the thresholds for two subjects when the signal was presented
immediately following the 20-ms masker. The dashed line gives the prediction
assuming additivity of masking. Note from the insert that the masker and
signal+masker waveshapes are clearly discriminable when either masker is presented
alone.  When  the  two  maskers  are  combined,  however,  the  difference  in  waveshapes  is 
much  less  clear  and  the  signal  may  be  perceived  as  a  continuation  of  the  masker. 
As  much  as  34  dB  of  excess  masking  is  obtained  in  this  condition,  a  record  amount 
for the combination of these types of maskers. Perhaps more interesting is that
when  the  signal  is  presented  simultaneously  with  the  20-ms  masker  (second  panel), 
the  excess  masking  is  essentially  eliminated.  In  this  case,  whether  the  maskers  are 
presented  individually  or  together,  the  masker  and  signal+masker  waveshapes  are 
never  particularly  easy  to  discriminate. 

C.  Summary 

Recent  models  have  been  successful  in  describing  the  masking  produced  by 
various masker pairs based only on the relative amount of masking produced by each
member of the pair (Jesteadt, 1983; Lutfi, 1983, 1985; Penner, 1980; Penner and
Shiffrin,  1980).  The  present  study  reveals  several  exceptions  to  the  predictions  of 
these  models.  The  masking  produced  by  pairs  of  concurrent  forward  maskers  is  found 
to depend not only on the relative amounts of masking, but also on the particular
combination of masker frequencies chosen. The pattern of results is complex,
suggesting that several different processes may have been involved. Among these are
off-frequency listening, cueing effects, and masking by aural distortion products. Even
when  such  factors  can  be  ruled  out,  there  is  still  found  to  be  a  large  discrepancy  in 
the  amount  of  combined  masking  produced  by  equated  pairs  of  concurrent  versus 
sequential  forward  maskers.  It  may  be  that  many  of  the  results  obtained  using 
combinations  of  forward  maskers  can  be  accounted  for  in  terms  of  differences  among 
stimulus  waveshapes. 

FOOTNOTE

(1) This assumption is contingent upon the condition that the effective components of
the  maskers  are  separated  either  in  frequency  or  in  time. 

FIGURE CAPTIONS


Fig.  1  Masking  functions  for  the  three  pairs  of  concurrent  forward  maskers.  Unfilled 
circles  give  mean  thresholds  (4  subjects)  in  the  presence  of  the  variable-level 
masker  alone.  Vertical  bars  represent  one  standard  error  on  either  side  of 
the  mean.  The  dashed  line  denotes  the  threshold  in  the  presence  of  the 
fixed-level  (65  dB  SPL)  masker  alone.  Filled  triangles  give  thresholds  when 
the  variable-  and  fixed-level  maskers  are  combined. 

Fig.  2  The  ordinate  gives,  for  each  masker  pair,  the  amount  of  masking  above  or 
below  that  produced  by  the  most  effective  member  in  the  pair.  The 
abscissa  gives  the  difference  in  the  amount  of  masking  produced  by  the 
individual  members  of  the  pair.  Unfilled  circles  denote  that  the  fixed-level 
masker  was  the  lower  frequency  masker.  Filled  squares  denote  that  the 
fixed-level  masker  was  the  higher-frequency  masker.  The  solid  line  in  the 
center  of  each  panel  gives  the  prediction  based  on  simple  additivity  of 
masking. 

Fig.  3  Same  as  Fig.  2,  except  the  two  maskers  were  presented  simultaneously  with 
the  signal. 

Fig.  4  Panel  A:  Signal  threshold  in  the  presence  of  a  pair  of  sequential  forward 
maskers  is  plotted  as  a  function  of  the  threshold  in  the  presence  of  either 
masker  alone  (different  symbols  represent  data  from  two  subjects).  The 
dashed  line  gives  the  prediction  based  on  additivity  of  masking.  Panel  B: 
Same  as  panel  A,  except  the  signal  was  presented  simultaneously  with  the 
trailing masker. The insert in each panel shows the waveshapes of the
signal  and  both  maskers  combined.  The  cross-hatched  region  denotes  the 
signal. 